我想从数组中随机选择一个元素,但每个元素都有一个已知的选择概率。
所有机会加在一起(在数组中)总和为1。
你认为哪种算法是最快的,最适合进行大规模计算?
示例:
id => chance
array[
0 => 0.8
1 => 0.2
]
对于这个伪代码,所讨论的算法应该在多个调用上为id 0
上的一个元素在id 1
上统计返回四个元素。
发布于 2010-12-17 01:24:57
计算列表的离散累积密度函数(CDF) --或者简单地说,就是权重的累积和数组。然后在0和所有权重的总和之间生成一个随机数(在您的示例中可能是1),执行二进制搜索以在离散CDF数组中查找此随机数并获得与此条目对应的值--这是您的加权随机数。
发布于 2010-12-17 01:26:44
算法简单明了
rand_no = rand(0,1)
for each element in array
if(rand_num < element.probablity)
select and break
rand_num = rand_num - element.probability
发布于 2014-04-22 20:52:34
下面是Ruby的一个实现:
def weighted_rand(weights = {})
raise 'Probabilities must sum up to 1' unless weights.values.inject(&:+) == 1.0
raise 'Probabilities must not be negative' unless weights.values.all? { |p| p >= 0 }
# Do more sanity checks depending on the amount of trust in the software component using this method,
# e.g. don't allow duplicates, don't allow non-numeric values, etc.
# Ignore elements with probability 0
weights = weights.reject { |k, v| v == 0.0 } # e.g. => {"a"=>0.4, "b"=>0.4, "c"=>0.2}
# Accumulate probabilities and map them to a value
u = 0.0
ranges = weights.map { |v, p| [u += p, v] } # e.g. => [[0.4, "a"], [0.8, "b"], [1.0, "c"]]
# Generate a (pseudo-)random floating point number between 0.0(included) and 1.0(excluded)
u = rand # e.g. => 0.4651073966724186
# Find the first value that has an accumulated probability greater than the random number u
ranges.find { |p, v| p > u }.last # e.g. => "b"
end
使用方法:
weights = {'a' => 0.4, 'b' => 0.4, 'c' => 0.2, 'd' => 0.0}
weighted_rand weights
粗略估计会发生什么:
sample = 1000.times.map { weighted_rand weights }
sample.count('a') # 396
sample.count('b') # 406
sample.count('c') # 198
sample.count('d') # 0
https://stackoverflow.com/questions/4463561
复制相似问题