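I have a smallish 1D NumPy array, on the order of 100 elements long...
I want to know how many times a subarray occurs. Assume every element of the array is either 1 or 0. I want to count the number of runs of at least three 0s in a row.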
For np.array([0,0,0,0,1,0,0,1,0,0,0]), I want to return 2.
For np.array([0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,1,0,0,0]), I want to return 2.
For np.array([0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0]), I want to return 5.
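I tried casting to a string and using string.count(). That works, but I need a faster solution. I will be calling this function millions of times per minute.
Currently I am looping over the array, which is slow, but still much faster (about 4x) than converting to a string. (I know the string conversion is slow, and so are the string operations...)
Any ideas would be greatly appreciated.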
Regarding the itertools suggestion:
I wrote a small performance check:
import numpy as np
import itertools
import time

def itertools_solution(full_array):
    trueFalse = full_array == 0
    count = [ sum( 1 for _ in group ) for key, group in itertools.groupby( trueFalse ) if key ]
    above = [val for val in count if val >= 3]
    return len(above)

def looping_solution(full_array):
    total_count = 0
    running_count = 0
    for counter, val in enumerate(full_array):
        if val == 0:
            running_count += 1
            if running_count == 3:
                total_count += 1
        else:
            running_count = 0
    return total_count

a = np.array([[0,0,0,0,2,0,0,5,0,0,0,5,5,5],
              [0,0,0,0,0,1,1,0,1,4,0,0,4,4],
              [0,0,1,1,0,0,4,4,4,0,4,0,0,1],
              [3,2,2,3,3,0,0,3,2,6,6,6,0,0],
              [0,1,4,5,0,4,0,0,0,5,0,2,1,0],
              [0,0,3,6,6,6,0,0,0,2,2,3,3,6],
              [2,0,0,2,5,5,5,0,0,0,5,0,0,0],
              [1,3,0,0,1,3,3,6,6,0,0,4,6,0],
              [5,5,5,0,0,2,2,2,5,0,0,0,2,2],
              [6,6,6,0,0,0,6,0,3,3,3,0,0,3],
              [4,4,0,4,4,0,0,1,0,1,1,1,0,0]]).flatten()

time_start = time.time()
for cnt in range(1000):
    itertools_solution(a)
print('itertools took %f seconds' % (time.time() - time_start))

time_start = time.time()
for cnt in range(1000):
    looping_solution(a)
print('looping took %f seconds' % (time.time() - time_start))
The results:
itertools took 0.185000 seconds
looping took 0.038001 seconds
Unfortunately, it does not solve my performance problem.
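For comparison, the same count can also be written without a Python-level loop. The following is only a minimal sketch: it pads a boolean "is zero" mask and uses np.diff to locate where each run of zeros starts and ends. The names count_zero_runs and min_len are made up for this illustration, and the function is not timed here; for arrays of only about 100 elements, NumPy call overhead may well outweigh any gain over the plain loop.

```python
import numpy as np

def count_zero_runs(arr, min_len=3):
    """Count maximal runs of zeros whose length is at least min_len.

    Hypothetical helper for illustration; not part of the benchmark above.
    """
    is_zero = np.asarray(arr) == 0
    # Pad with False on both sides so that runs touching either end of the
    # array still produce a start transition and an end transition.
    padded = np.concatenate(([False], is_zero, [False]))
    # After diff, +1 marks the index where a run of zeros starts and
    # -1 marks the index one past where it ends.
    edges = np.diff(padded.astype(np.int8))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    # ends - starts is the length of each run, in order of appearance.
    return int(np.count_nonzero(ends - starts >= min_len))

print(count_zero_runs(np.array([0,0,0,0,1,0,0,1,0,0,0])))
```

Each maximal run is counted once, which matches what looping_solution does when it increments total_count only at the moment running_count reaches 3.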
https://stackoverflow.com/questions/51351195