Finding a large number of consecutive values that satisfy a condition in a numpy array?

Content sourced from Stack Overflow, translated and used under the CC BY-SA 3.0 license


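I have some audio data loaded in a numpy array, and I want to segment the data by finding the silent parts, i.e. the parts where the audio amplitude stays below a certain threshold over a period of time.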

A very simple way to do this is something like this:

import re

# Encode each sample as "1" (below the threshold) or "0", then search for long runs of "1"
values = ''.join("1" if abs(x) < SILENCE_THRESHOLD else "0" for x in samples)
pattern = re.compile('1{%d,}' % int(MIN_SILENCE))
for match in pattern.finditer(values):
    # code goes here

The code above finds the parts where at least MIN_SILENCE consecutive elements are smaller than SILENCE_THRESHOLD.

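Now, obviously, the code above is horribly inefficient and a terrible abuse of regular expressions. Is there another method that is more efficient but still results in equally simple and short code?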

Answer:

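I think it should be faster than the other options. However, it does require about twice as much memory as the various generator-based solutions. As long as you can hold a single temporary copy of your data in memory (for the diff), plus a boolean array of the same length as your data (one byte per element in numpy), it should be quite efficient.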

import numpy as np

def main():
    # Generate some random data
    x = np.cumsum(np.random.random(1000) - 0.5)
    condition = np.abs(x) < 1

    # Print the start and stop indices of each region where the absolute
    # values of x are below 1, and the min and max of each of these regions
    for start, stop in contiguous_regions(condition):
        segment = x[start:stop]
        print(start, stop)
        print(segment.min(), segment.max())

def contiguous_regions(condition):
    """Finds contiguous True regions of the boolean array "condition". Returns
    a 2D array where the first column is the start index of the region and the
    second column is the end index (exclusive, so x[start:stop] is the region)."""

    # Find the indices of changes in "condition"
    d = np.diff(condition)
    idx, = d.nonzero() 

    # We need to start things after the change in "condition". Therefore, 
    # we'll shift the index by 1 to the right.
    idx += 1

    if condition[0]:
        # If the start of condition is True prepend a 0
        idx = np.r_[0, idx]

    if condition[-1]:
        # If the end of condition is True, append the length of the array
        idx = np.r_[idx, condition.size]

    # Reshape the result into two columns
    return idx.reshape(-1, 2)

main()
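
To tie this back to the question, the regions found this way can be filtered by length so that only silences of at least a minimum duration remain. The following is a minimal sketch rather than part of the original answer: the helper name `find_silences`, the synthetic `samples` array, and the values chosen for `SILENCE_THRESHOLD` and `MIN_SILENCE` are assumptions made for illustration. `contiguous_regions` is restated in a compact form so the block runs on its own.

```python
import numpy as np

SILENCE_THRESHOLD = 0.1   # assumed amplitude threshold (illustrative value)
MIN_SILENCE = 5           # assumed minimum run length in samples (illustrative value)

def contiguous_regions(condition):
    """Return an (n, 2) array of [start, stop) indices of the True runs."""
    condition = np.asarray(condition, dtype=bool)
    if condition.size == 0:
        return np.empty((0, 2), dtype=int)
    # Positions where the value changes, shifted to the first element after the change
    idx = np.flatnonzero(condition[1:] != condition[:-1]) + 1
    if condition[0]:
        idx = np.r_[0, idx]
    if condition[-1]:
        idx = np.r_[idx, condition.size]
    return idx.reshape(-1, 2)

def find_silences(samples, threshold=SILENCE_THRESHOLD, min_len=MIN_SILENCE):
    """Hypothetical helper: keep only quiet runs that are at least min_len long."""
    regions = contiguous_regions(np.abs(samples) < threshold)
    lengths = regions[:, 1] - regions[:, 0]
    return regions[lengths >= min_len]

# Synthetic data: two quiet runs of length 5 and one of length 2
samples = np.array([0.5, 0.0, 0.01, 0.02, 0.0, 0.03, 0.9,
                    0.0, 0.0, 0.7, 0.0, 0.0, 0.0, 0.0, 0.0])
silences = find_silences(samples)
for start, stop in silences:
    print(start, stop)
```

The short run at indices 7 to 8 is dropped because it is shorter than `MIN_SILENCE`, which mirrors what the `1{%d,}` pattern does in the regex version.
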
Answer:

One option is to smooth the signal and then threshold it, as follows:

from scipy.ndimage import gaussian_filter
sigma = 3
threshold = 1
above_threshold = gaussian_filter(data, sigma=sigma) > threshold
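
The snippet above leaves `data` undefined and smooths the signed signal directly. For audio that oscillates around zero, it is probably the magnitude that should be smoothed, since positive and negative samples would otherwise cancel out. The sketch below illustrates the same smooth-then-threshold idea using only numpy, with a plain moving average standing in for the Gaussian filter. It is not the original answer's code: the function name `quiet_mask`, the `window` and `threshold` parameters, and the synthetic signal are all assumptions.

```python
import numpy as np

def quiet_mask(data, window=5, threshold=0.1):
    """Hypothetical helper: True where the moving average of |data| is below threshold."""
    kernel = np.ones(window) / window
    # mode="same" keeps the output the same length as the input (edges are zero-padded)
    envelope = np.convolve(np.abs(data), kernel, mode="same")
    return envelope < threshold

# Synthetic signal: alternating +1/-1 "tone" with a silent gap in the middle
t = np.arange(60)
data = np.where(t % 2 == 0, 1.0, -1.0)
data[20:40] = 0.0

mask = quiet_mask(data)
print(np.flatnonzero(mask))
```

The mask comes out slightly narrower than the actual gap because the averaging window still overlaps loud samples near the edges. A larger `window` demands a longer quiet stretch, and a lower `threshold` demands a quieter one.
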
