Q: Programmatically nested numba.cuda function calls

Stack Overflow user

Asked 2018-10-15 06:06:07

Answers: 1 · Views: 770 · Followers: 0 · Votes: 2

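Numba & CUDA newbie here. I would like one numba.cuda function to programmatically call another from the device, without having to pass any data back to the host. For example, given the setup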

from numba import cuda

@cuda.jit('int32(int32)', device=True)
def a(x):
    return x+1

@cuda.jit('int32(int32)', device=True)
def b(x):
    return 2*x

I would like to be able to define a composition kernel function such as

@cuda.jit('void(int32, __device__, int32)')
def b_comp(x, inner, result):
    y = inner(x)
    result = b(y)

and successfully obtain

b_comp(1, a, result)
assert result == 4

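Ideally, I would like b_comp to accept varying function arguments after it has been compiled, e.g. after the call above, to still accept b_comp(1, b, result). But a solution where the function arguments become fixed at compile time would still work for me.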

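From what I have read, CUDA itself appears to support passing function pointers. This post suggests that numba.cuda has no such support, but the post is not convincing and is also a year old. The page for supported Python in numba.cuda does not mention function pointer support. It does, however, link to the supported Python in numba page, which makes clear that numba.jit() does support functions as arguments, although they are fixed at compile time. If numba.cuda.jit() does the same then, as I said above, that would work. In that case, how should I declare the variable type when specifying the signature for comp? Or could I use numba.cuda.autojit()?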

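If numba does not support any such direct approach, is metaprogramming a reasonable option? For example, once I know the inner function, my script could generate a new script containing the Python function that composes it, apply numba.cuda.jit() to that, and then import the result. This seems convoluted, but it is the only other numba-based option I can think of.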
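For the case where the function arguments are fixed at compile time, one possible sketch is a factory function that closes over the inner and outer functions. It assumes that numba.cuda resolves device functions captured from an enclosing scope at compile time, which is worth verifying against the installed numba version. The names `make_comp` and `kernel_jit` are hypothetical and not part of numba; `kernel_jit` stands for whatever decorator should compile the kernel, e.g. `cuda.jit('void(int32, int32[:])')`.

```python
def make_comp(inner, outer, kernel_jit):
    """Build a kernel that writes outer(inner(x)) into out_array[0].

    inner and outer are captured by the closure, so each call to
    make_comp yields a separate kernel with those functions fixed.
    kernel_jit is a hypothetical parameter: the decorator used to compile.
    """
    @kernel_jit
    def comp(x, out_array):
        out_array[0] = outer(inner(x))
    return comp


# Intended use with numba (assumption; needs a CUDA-capable device):
#   from numba import cuda
#   a_dev = cuda.jit('int32(int32)', device=True)(a)
#   b_dev = cuda.jit('int32(int32)', device=True)(b)
#   comp = make_comp(a_dev, b_dev, cuda.jit('void(int32, int32[:])'))
#   comp[1, 1](1, out_array)  # out_array: a one-element int32 array
```

Because the decorator is passed in, the same factory can be exercised on the host with an identity decorator (`lambda f: f`) to check the composition logic before involving the GPU.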

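If numba cannot do this at all, or at least not without serious contortions, I would be happy with an answer that gives a few details plus a recommendation such as "switch to PyCuda".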

python

cuda

numba

1 Answer

Stack Overflow user

Answered 2018-10-15 09:43:47

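Here is what worked for me:

1. Not decorating my functions with cuda.jit at first, so that they still have a __name__ attribute
2. Getting the __name__ attribute
3. Applying cuda.jit to my functions afterwards by calling the decorator directly
4. Creating the Python source for the composition function in a string and passing it to exec

The exact code is: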

from numba import cuda
import numpy as np


def a(x):
    return x+1

def b(x):
    return 2*x


# Here, pretend we've been passed the inner function and the outer function as arguments
inner_fun = a
outer_fun = b

# And pretend we have noooooo idea what functions these guys actually point to
inner_name = inner_fun.__name__
outer_name = outer_fun.__name__

# Now manually apply the decorator
a = cuda.jit('int32(int32)', device=True)(a)
b = cuda.jit('int32(int32)', device=True)(b)

# Now construct the definition string for the composition function, and exec it.
exec_string = '@cuda.jit(\'void(int32, int32[:])\')\n' \
              'def custom_comp(x, out_array):\n' \
              '    out_array[0]=' + outer_name + '(' + inner_name + '(x))\n'

exec(exec_string)

# dtype must match the int32[:] signature; launch one block of one thread
out_array = np.array([-1], dtype=np.int32)
custom_comp[1, 1](1, out_array)
print(out_array)

As expected, the output is

[4]
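The string-building and exec steps can also be wrapped in small helpers that check the names and run exec inside an explicit namespace instead of the module globals. This is only a sketch: `build_comp_source` and `define_kernel` are hypothetical helper names, and the namespace is assumed to contain `cuda` plus the already-jitted device functions under the names being composed.

```python
def build_comp_source(inner_name, outer_name, kernel_name="custom_comp"):
    """Return Python source for a kernel computing outer(inner(x))."""
    for name in (inner_name, outer_name, kernel_name):
        # Refuse anything that is not an identifier before it reaches exec.
        if not name.isidentifier():
            raise ValueError(f"not a valid identifier: {name!r}")
    return (
        "@cuda.jit('void(int32, int32[:])')\n"
        f"def {kernel_name}(x, out_array):\n"
        f"    out_array[0] = {outer_name}({inner_name}(x))\n"
    )


def define_kernel(source, namespace, kernel_name="custom_comp"):
    """exec the source in the given namespace and return the new function."""
    exec(source, namespace)
    return namespace[kernel_name]


# Intended use with numba (assumption; a and b are the jitted device functions):
#   namespace = {"cuda": cuda, "a": a, "b": b}
#   kernel = define_kernel(build_comp_source("a", "b"), namespace)
#   kernel[1, 1](1, out_array)
```

Keeping the namespace explicit makes it clear which names the generated kernel can see, and the identifier check keeps arbitrary text out of the exec'd string.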

Votes: 3

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/52807489
