
How to calculate the fundamental frequency f(0) in the time domain?

Stack Overflow user
Asked on 2020-04-30 23:47:33
1 answer · Viewed 1.3K times · 0 followers · Score 2

I am new to digital signal processing and am trying to calculate the fundamental frequency, f(0), of each segmented frame of an audio file. F0 estimation methods can be divided into three categories:

  • those based on the temporal dynamics of the signal in the time domain;
  • those based on the frequency structure, in the frequency domain;
  • hybrid methods.

Most examples estimate the fundamental frequency from the frequency structure in the frequency domain; what I am looking for is an estimate based on the temporal dynamics of the signal in the time domain.

This gist provides some information, but I am still not clear on how to compute it in the time domain:

https://gist.github.com/endolith/255291

This is the code I have found so far:

Code language: Python
from numpy import diff, nonzero, argmax
from scipy.signal import correlate
# `parabolic` is the sub-sample peak interpolation helper from the same gist;
# a version of it is defined in the accepted answer below

def freq_from_autocorr(sig, fs):
    """
    Estimate frequency using autocorrelation
    """
    # Calculate autocorrelation and throw away the negative lags
    corr = correlate(sig, sig, mode='full')
    corr = corr[len(corr)//2:]

    # Find the first low point
    d = diff(corr)
    start = nonzero(d > 0)[0][0]

    # Find the next peak after the low point (other than 0 lag).  This bit is
    # not reliable for long signals, due to the desired peak occurring between
    # samples, and other peaks appearing higher.
    # Should use a weighting function to de-emphasize the peaks at longer lags.
    peak = argmax(corr[start:]) + start
    px, py = parabolic(corr, peak)

    return fs / px

How do I carry out this estimation in the time domain?

Thanks in advance!


1 Answer

Stack Overflow user

Accepted answer

Posted on 2020-05-01 06:23:09

This is a correct implementation. It is not very robust, but it certainly works. To verify that, we can generate a signal of known frequency and see what result we get:

Code language: Python
import numpy as np
from scipy.io import wavfile
from scipy.signal import correlate, fftconvolve
from scipy.interpolate import interp1d

# Synthesize a short 440 Hz test tone
fs = 44100
frequency = 440
length = 0.01 # in seconds

t = np.linspace(0, length, int(fs * length)) 
y = np.sin(frequency * 2 * np.pi * t)

# Fit a parabola through f[x-1], f[x], f[x+1] to locate the peak with sub-sample precision
def parabolic(f, x):
    xv = 1/2. * (f[x-1] - f[x+1]) / (f[x-1] - 2 * f[x] + f[x+1]) + x
    yv = f[x] - 1/4. * (f[x-1] - f[x+1]) * (xv - x)
    return (xv, yv)

def freq_from_autocorr(sig, fs):
    """
    Estimate frequency using autocorrelation
    """
    corr = correlate(sig, sig, mode='full')
    corr = corr[len(corr)//2:]
    d = np.diff(corr)
    start = np.nonzero(d > 0)[0][0]
    peak = np.argmax(corr[start:]) + start
    px, py = parabolic(corr, peak)

    return fs / px

Result

Running freq_from_autocorr(y, fs) yields ~442.014 Hz, an error of roughly 0.45%.
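
For reference, a minimal way to reproduce that figure with the test signal defined above (y, fs and frequency from the snippet) might look like the following; the exact digits may vary slightly between environments:

Code language: Python

estimate = freq_from_autocorr(y, fs)
error_pct = abs(estimate - frequency) / frequency * 100
print(f"estimate: {estimate:.3f} Hz, error: {error_pct:.2f}%")
# should print an estimate close to 442 Hz, i.e. an error under half a percent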

Bonus: we can improve it

With a little more code we can make it noticeably more precise and more robust:

Code language: Python
def indexes(y, thres=0.3, min_dist=1, thres_abs=False):
    """Peak detection routine borrowed from 
    https://bitbucket.org/lucashnegri/peakutils/src/master/peakutils/peak.py
    """
    if isinstance(y, np.ndarray) and np.issubdtype(y.dtype, np.unsignedinteger):
        raise ValueError("y must be signed")

    if not thres_abs:
        thres = thres * (np.max(y) - np.min(y)) + np.min(y)

    min_dist = int(min_dist)

    # compute first order difference
    dy = np.diff(y)

    # propagate left and right values successively to fill all plateau pixels (0-value)
    zeros, = np.where(dy == 0)

    # check if the signal is totally flat
    if len(zeros) == len(y) - 1:
        return np.array([])

    if len(zeros):
        # compute first order difference of zero indexes
        zeros_diff = np.diff(zeros)
        # check when zeros are not chained together
        zeros_diff_not_one, = np.add(np.where(zeros_diff != 1), 1)
        # make an array of the chained zero indexes
        zero_plateaus = np.split(zeros, zeros_diff_not_one)

        # fix if leftmost value in dy is zero
        if zero_plateaus[0][0] == 0:
            dy[zero_plateaus[0]] = dy[zero_plateaus[0][-1] + 1]
            zero_plateaus.pop(0)

        # fix if rightmost value of dy is zero
        if len(zero_plateaus) and zero_plateaus[-1][-1] == len(dy) - 1:
            dy[zero_plateaus[-1]] = dy[zero_plateaus[-1][0] - 1]
            zero_plateaus.pop(-1)

        # for each chain of zero indexes
        for plateau in zero_plateaus:
            median = np.median(plateau)
            # set leftmost values to leftmost non zero values
            dy[plateau[plateau < median]] = dy[plateau[0] - 1]
            # set rightmost and middle values to rightmost non zero values
            dy[plateau[plateau >= median]] = dy[plateau[-1] + 1]

    # find the peaks by using the first order difference
    peaks = np.where(
        (np.hstack([dy, 0.0]) < 0.0)
        & (np.hstack([0.0, dy]) > 0.0)
        & (np.greater(y, thres))
    )[0]

    # handle multiple peaks, respecting the minimum distance
    if peaks.size > 1 and min_dist > 1:
        highest = peaks[np.argsort(y[peaks])][::-1]
        rem = np.ones(y.size, dtype=bool)
        rem[peaks] = False

        for peak in highest:
            if not rem[peak]:
                sl = slice(max(0, peak - min_dist), peak + min_dist + 1)
                rem[sl] = True
                rem[peak] = False

        peaks = np.arange(y.size)[~rem]

    return peaks

def freq_from_autocorr_improved(signal, fs):
    signal -= np.mean(signal)  # Remove DC offset
    corr = fftconvolve(signal, signal[::-1], mode='full')
    corr = corr[len(corr)//2:]

    # Find the first peak on the left
    i_peak = indexes(corr, thres=0.8, min_dist=5)[0]
    i_interp = parabolic(corr, i_peak)[0]

    return fs / i_interp, corr, i_interp

Running freq_from_autocorr_improved(y, fs) yields ~441.825 Hz, an error of roughly 0.41%. This method performs better in more complex cases, at about twice the computation time.
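
Note that freq_from_autocorr_improved returns a tuple (the frequency estimate together with the autocorrelation and the interpolated peak index), so a call would presumably be unpacked along these lines:

Code language: Python

f0_est, corr, i_interp = freq_from_autocorr_improved(y, fs)
print(f"improved estimate: {f0_est:.3f} Hz "
      f"(error {abs(f0_est - frequency) / frequency * 100:.2f}%)")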

With a longer sampling time (i.e., setting length to 0.1 s) we would get more accurate results.
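
As a rough sketch of that check, re-using the definitions above (the exact figures will depend on the setup):

Code language: Python

length = 0.1  # seconds, ten times longer than before
t = np.linspace(0, length, int(fs * length))
y = np.sin(frequency * 2 * np.pi * t)

print(freq_from_autocorr(y, fs))              # simple estimator
print(freq_from_autocorr_improved(y, fs)[0])  # improved estimator, frequency only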

Score 3
The original content of this page is provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/61534687
