Q: Is there any way to vectorize this loop?

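Stack Overflow user

Asked 2022-09-20 14:59:08

Answers 5, Views 137, Followers 0, Votes 1

I am trying to simulate the results of two different dice. One die is fair (each number has probability 1/6), while the other is not.

I have a numpy array of 0s and 1s that says which die is used on each throw: 0 is the fair die and 1 is the other one. I want to compute another numpy array holding the results. To do this I used the following code: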

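import numpy as np
from numpy import random as rnd  # assumption: the question does not show how rnd is defined

def dice_simulator(dices : np.ndarray) -> np.ndarray: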
  n = len(dices)
  results = np.zeros(n)
  i = 0
  for dice in np.nditer(dices):
    if dice:
      results[i] = rnd.choice(6, p = [1/12, 1/12, 1/12, 1/4, 1/4, 1/4]) + 1
    else:
      results[i] = rnd.choice(6) + 1
    i += 1
  return results

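Compared with the rest of the program this takes a lot of time, and I think that is because I am iterating over a numpy array instead of using vectorized operations. Can anyone help me?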

python

numpy

random

5 Answers

Stack Overflow user

Accepted answer

Posted 2022-09-20 15:14:56

This is the correct way to do it.

def dice_simulator(dices: np.ndarray) -> np.ndarray:
    return np.where(
        dices,
        rnd.choice(6, dices.shape, p = [1/12, 1/12, 1/12, 1/4, 1/4, 1/4]),
        rnd.choice(6, dices.shape)
    ) + 1
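
A quick way to check that the np.where version behaves as intended is to look at how often each die lands on a high face (4, 5 or 6). The sketch below is illustrative only: it assumes rnd is a seeded NumPy Generator (the answer does not say how rnd is defined), and the sample size and seed are arbitrary.

```python
import numpy as np

# Assumption: any NumPy random source with a .choice method works here.
rnd = np.random.default_rng(42)

def dice_simulator(dices: np.ndarray) -> np.ndarray:
    # Draw a full-size array for each die, then pick per position.
    return np.where(
        dices,
        rnd.choice(6, dices.shape, p=[1/12, 1/12, 1/12, 1/4, 1/4, 1/4]),
        rnd.choice(6, dices.shape),
    ) + 1

dices = rnd.integers(0, 2, size=200_000)  # 0 = fair die, 1 = biased die
results = dice_simulator(dices)

# Share of high faces per die: roughly 0.5 for the fair die,
# roughly 0.75 for the biased one (1/4 + 1/4 + 1/4).
fair_high = np.mean(results[dices == 0] >= 4)
biased_high = np.mean(results[dices == 1] >= 4)
print(f"fair: {fair_high:.3f}, biased: {biased_high:.3f}")
```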

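Edit: as the other answers point out, this answer generates two full-size random arrays, which can be wasteful. One way to avoid any over-generation is based on @Claudio's answer, but with zero over-generation, as follows.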

def dice_simulator_slices_improved(dices):
    if dices.dtype != bool:
        dices = dices.astype(bool) # because we will iterate over it 3 times.
    N = dices.shape[0]
    n_Ones  = np.count_nonzero(dices)
    n_zeros = N - n_Ones
    results = np.empty(dices.shape[0],dtype=float) # reserve output array
    results[np.logical_not(dices)] = np.random.choice([1,2,3,4,5,6], size=n_zeros)
    results[dices] = np.random.choice(
        [1,2,3,4,5,6], size=n_Ones, p=[1/12,1/12,1/12,1/4,1/4,1/4])
    return results

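This is generally the fastest way to achieve zero over-generation. How np.where compares with this non-over-generating method depends on the two arrays used to fill the result. If they are very cheap to compute, such as arrays of 0s and 1s, np.where is almost 5 times faster because it iterates over dices only once. But if generating them is as expensive as np.random.choice with the p parameter, which happens to be very expensive, then avoiding over-generation pays off.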
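
The trade-off described above can be measured with a small timing harness. The sketch below is an illustration rather than the answerer's benchmark: the names with_where and with_masks are hypothetical, the array size is arbitrary, and the absolute numbers will vary by machine and NumPy version.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
P_BIASED = [1/12, 1/12, 1/12, 1/4, 1/4, 1/4]

def with_where(dices):
    # Over-generates: two full-size draws, one value kept per position.
    return np.where(
        dices,
        rng.choice(6, dices.shape, p=P_BIASED),
        rng.choice(6, dices.shape),
    ) + 1

def with_masks(dices):
    # Generates exactly as many values as each die needs.
    mask = dices.astype(bool)
    n_ones = np.count_nonzero(mask)
    out = np.empty(mask.shape[0], dtype=np.int64)
    out[~mask] = rng.choice(6, size=mask.shape[0] - n_ones) + 1
    out[mask] = rng.choice(6, size=n_ones, p=P_BIASED) + 1
    return out

dices = rng.integers(0, 2, size=100_000)
for fn in (with_where, with_masks):
    # Best of 3 repeats, 5 calls each, reported per call.
    seconds = min(timeit.repeat(lambda: fn(dices), number=5, repeat=3)) / 5
    print(f"{fn.__name__}: {seconds * 1e3:.2f} ms per call")
```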

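Votes 0

Stack Overflow user

Posted 2022-09-20 16:04:52

The answers given so far vectorize by over-generating and then throwing away part of the output, which seems wrong.

In addition, I will generalize to any number of dice.

First, you need a condlist: a list with one entry per die, where the i-th entry is a boolean array that is True wherever the i-th die should be used: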

dices_idxs = np.array([0, 1, 2])
dices_sequence = np.array([0, 1, 2, 2, 1, 1, 0])

condlist = np.equal(*np.broadcast_arrays(dices_sequence[None, :], dices_idxs[:, None]))

print(condlist)

# [[ True False False False False False  True]
#  [False  True False False  True  True False]
#  [False False  True  True False False False]]
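
The same boolean matrix can also be built with a plain broadcast comparison, which some readers may find easier to follow. This is an equivalent sketch, not a change to the answer's approach; it reuses the example arrays from above.

```python
import numpy as np

dices_idxs = np.array([0, 1, 2])
dices_sequence = np.array([0, 1, 2, 2, 1, 1, 0])

# A (num_dices, 1) column compared with a (num_throws,) row
# broadcasts to a (num_dices, num_throws) boolean matrix.
condlist = dices_idxs[:, None] == dices_sequence

# Same result as the explicit broadcast_arrays form used in the answer.
reference = np.equal(
    *np.broadcast_arrays(dices_sequence[None, :], dices_idxs[:, None])
)
print(np.array_equal(condlist, reference))  # True
```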

Second, you can generalize the answer given by @Ahmed using np.select.

def dice_simulator_select(dices_sequence, dices_weights):
    faces = np.arange(1, 7)
    num_dices = len(dices_weights)
    dices_idxs = np.arange(num_dices)
    num_throws = len(dices_sequence)

    condlist = list(
        np.equal(*np.broadcast_arrays(dices_sequence[None, :], dices_idxs[:, None]))
    )
    choicelist = [
        RNG.choice(faces, size=num_throws, p=dices_weights[dice_idx])
        for dice_idx in range(num_dices)
    ]
    return np.select(condlist, choicelist)

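But this has the problem mentioned at the start: it over-generates and then discards some of the generated values, which may be problematic given the randomness involved.

A more correct approach is to use np.piecewise: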

def dice_simulator_piecewise(dices_sequence, dices_weights):
    faces = np.arange(1, 7)
    num_dices = len(dices_weights)
    dices_idxs = np.arange(num_dices)

    condlist = list(
        np.equal(*np.broadcast_arrays(dices_sequence[None, :], dices_idxs[:, None]))
    )
    # Note: size=len(x) ensures that no more samples than needed are generated.
    funclist = [
        lambda x: RNG.choice(faces, size=len(x), p=dices_weights[int(x[0])])
    ] * num_dices


    return np.piecewise(dices_sequence, condlist, funclist)
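
The function above relies on np.piecewise passing each function only the elements selected by its condition, which is why a single lambda repeated num_dices times can read the die index from x[0]. The toy sketch below makes that visible; make_func and seen are hypothetical names used only for this illustration.

```python
import numpy as np

x = np.array([0, 1, 2, 2, 1, 1, 0])
condlist = [x == k for k in range(3)]

seen = []  # records (die index, number of elements received)

def make_func(k):
    def func(sub):
        # sub holds only the entries of x where condlist[k] is True.
        seen.append((k, len(sub)))
        return np.full(len(sub), 10 * (k + 1))
    return func

out = np.piecewise(x, condlist, [make_func(k) for k in range(3)])
print(out)   # [10 20 30 30 20 20 10]
print(seen)  # [(0, 2), (1, 3), (2, 2)]
```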

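You can use these functions as shown below and see that the correct function, the one using np.piecewise, is even faster (about 20% faster in the case below):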

RNG = np.random.default_rng()

dices_weights = [
    None,  # uniform
    [1 / 12, 1 / 12, 1 / 12, 1 / 4, 1 / 4, 1 / 4],
    None,
    [1 / 4, 1 / 4, 1 / 4, 1 / 12, 1 / 12, 1 / 12],
    None,
    [1 / 12, 1 / 12, 1 / 12, 1 / 4, 1 / 4, 1 / 4],
]
num_dices = len(dices_weights)
num_throws = 1_000
dices_sequence = RNG.choice(np.arange(num_dices), size=num_throws)


%timeit dice_simulator_select(dices_sequence, dices_weights)
%timeit dice_simulator_piecewise(dices_sequence, dices_weights)

# 311 µs ± 5.94 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
# 240 µs ± 10.3 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

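Votes 1

Stack Overflow user

Posted 2022-09-21 08:17:10

This is the fastest of the approaches posted so far (see the timings below), and since speed is what matters here (it is what prompted the question), it is also the best solution so far.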

def dice_simulator_slices(dices):
    results  = RNG.integers(1,high=6,endpoint=True,size=dices.shape[0])
    results[dices==1] = RNG.choice([1,2,3,4,5,6], 
        size=get_size(dices), p=[1/12,1/12,1/12,1/4,1/4,1/4])
    return results

Here are the imports and definitions the function above needs:

import numpy as np
RNG = np.random.default_rng()
get_size = np.count_nonzero

Now let's compare the timings of the other solutions with the one above:

dice_simulator_piecewise  SIZE = 100_000_000 : 5.483888
dice_simulator_add_arrays SIZE = 100_000_000 : 5.148283
dice_simulator_np_where   SIZE = 100_000_000 : 4.838409
dice_simulator_slices_gen SIZE = 100_000_000 : 3.437379
dice_simulator_slices     SIZE = 100_000_000 : 2.976977

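The results above suggest that optimizing away over-generation can actually slow things down, so over-generation is not necessarily wrong.

My current understanding (as Ahmed AEK states in his answer) is that the main speed bottleneck is the random choice computation when weights are supplied (note that in numpy the weights parameter is called 'p', not 'weights').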
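
To make concrete what a weighted draw involves, here is a minimal sketch of inverse-CDF sampling, a standard technique for weighted sampling. It is an illustration only: it is not taken from any of the answers, and no claim is made that it matches NumPy's internals or that it is faster than RNG.choice with p.

```python
import numpy as np

rng = np.random.default_rng(0)
p = np.array([1/12, 1/12, 1/12, 1/4, 1/4, 1/4])

# Cumulative distribution over the six faces.
cdf = np.cumsum(p)
cdf[-1] = 1.0  # guard against floating-point rounding in the last bin

# One uniform number per throw, then a binary search into the CDF.
u = rng.random(100_000)
faces = np.searchsorted(cdf, u, side="right") + 1

# Faces 4, 5 and 6 together should come up about 75% of the time.
print(f"share of high faces: {np.mean(faces >= 4):.3f}")
```

Compared with a uniform draw, the weighted draw needs the extra cumulative sum and a search per sample, which is one plausible reason it costs more.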

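My "generic slices" solution is a bit slower, but it is still faster than the other proposed solutions (see the timings above), and like the "piecewise" solution it supports any number of dice: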

def dice_simulator_slices_gen(arr_dice_nums, arr_dice_num_weight):
    faces = np.arange(1, 7)
    results = np.empty(arr_dice_nums.shape[0], dtype=np.int8)
    for dice_num, weight in enumerate(arr_dice_num_weight): 
        bln_slice = arr_dice_nums == dice_num
        no_throws = np.count_nonzero(bln_slice)
        if weight is None: 
            results[bln_slice]=RNG.integers(1,high=6,endpoint=True,size=no_throws)
        else: 
            results[bln_slice]=RNG.choice(faces,p=weight,size=no_throws)
    return results

Votes 1

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/73788790
