Q: Numba - How to generate random numbers in @guvectorize with the "cuda" target?

Stack Overflow user

Asked 2018-02-17 19:09:50

1 answer, 1.1K views, 0 followers, 1 vote

In this (dumb) example, I am trying to compute π by counting how many points chosen at random in (0, 1) x (0, 1) fall inside the unit circle.

@guvectorize(['void(float64[:], int32, float64[:])'], '(n),()->(n)', target='cuda')
def guvec_compute_pi(arr, iters, res):
    n = arr.shape[0]
    for t in range(n):
        inside = 0
        for i in range(iters):
            x = np.random.random()
            y = np.random.random()
            if x ** 2 + y ** 2 <= 1.0:
                inside += 1
        res[t] = 4.0 * inside / iters

This exception is raised during compilation:

numba.errors.UntypedAttributeError: Failed at nopython (nopython frontend)
Unknown attribute 'random' of type Module(<module 'numpy.random' from '...'>)
File "scratch.py", line 34
[1] During: typing of get attribute at /.../scratch.py (34)

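I naively thought that switching to the xoroshiro128+ RNG from numba.cuda.random would solve the problem. My modified code looks like this: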

@guvectorize(['void(float64[:], int32, float64[:])'], '(n),()->(n)', target='cuda')
def guvec_compute_pi(arr, iters, res):
    n = arr.shape[0]
    rng = create_xoroshiro128p_states(n, seed=1)
    for t in range(n):
        inside = 0
        for i in range(iters):
            x = xoroshiro128p_uniform_float64(rng, t)
            y = xoroshiro128p_uniform_float64(rng, t)
            if x ** 2 + y ** 2 <= 1.0:
                inside += 1
        res[t] = 4.0 * inside / iters

However, a similar error is raised:

numba.errors.TypingError: Failed at nopython (nopython frontend)
Untyped global name 'create_xoroshiro128p_states': cannot determine Numba type of <class 'function'>
File "scratch.py", line 28

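When I change to target='parallel', the original code using numpy.random.random works fine, with or without nopython=True. What causes the problem with target='cuda'? Is there a way to get random numbers inside a @guvectorize-d block?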

numba

1 Answer

Stack Overflow user

Accepted answer

Posted 2018-07-04 03:24:59

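The function create_xoroshiro128p_states is meant to be called on the CPU (host), not inside device code, as shown in this example from the Numba documentation, repeated below: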

from __future__ import print_function, absolute_import

from numba import cuda
from numba.cuda.random import (create_xoroshiro128p_states,
                               xoroshiro128p_uniform_float32)
import numpy as np

@cuda.jit
def compute_pi(rng_states, iterations, out):
    """Estimate pi in each thread and store the result in out[thread_id]"""
    thread_id = cuda.grid(1)

    # Compute pi by drawing random (x, y) points and finding what
    # fraction lie inside a unit circle
    inside = 0
    for i in range(iterations):
        x = xoroshiro128p_uniform_float32(rng_states, thread_id)
        y = xoroshiro128p_uniform_float32(rng_states, thread_id)
        if x**2 + y**2 <= 1.0:
            inside += 1

    out[thread_id] = 4.0 * inside / iterations

threads_per_block = 64
blocks = 24
rng_states = create_xoroshiro128p_states(threads_per_block * blocks, seed=1)
out = np.zeros(threads_per_block * blocks, dtype=np.float32)

compute_pi[blocks, threads_per_block](rng_states, 10000, out)
print('pi:', out.mean())

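It generates an array of randomly initialized state data, one entry per thread, so that random number generation on the GPU is independent across threads. The array it returns already lives on the device, which is a bit confusing given that the function is called from host code, but this is what lets you pass the random state straight to your GPU kernel.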
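For comparison, the estimate each GPU thread produces can be sketched on the CPU using only the Python standard library. This is a minimal reference, not part of the Numba example: the name estimate_pi and its seed parameter are hypothetical, and the standard-library generator stands in for the xoroshiro128+ state that each thread indexes in the kernel.

```python
import random


def estimate_pi(iterations, seed=None):
    """CPU reference for the per-thread Monte Carlo estimate of pi.

    Hypothetical helper: random.Random replaces the per-thread
    xoroshiro128+ state used in the CUDA kernel.
    """
    rng = random.Random(seed)  # one independent generator, like one thread's state
    inside = 0
    for _ in range(iterations):
        # Draw a point uniformly in the unit square [0, 1) x [0, 1)
        x = rng.random()
        y = rng.random()
        # Count it if it falls inside the quarter unit circle
        if x * x + y * y <= 1.0:
            inside += 1
    # The quarter circle has area pi / 4, so scale the hit fraction by 4
    return 4.0 * inside / iterations


if __name__ == "__main__":
    print("pi:", estimate_pi(100000, seed=1))
```

Averaging several such estimates made with different seeds mirrors what out.mean() does over the per-thread results, and gives a quick sanity check for the GPU output.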

1 vote

Original content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/48840654
