Voicing epoch determination with dynamic programming
During voiced speech, the point of maximum flow change in each glottal cycle corresponds to the point of maximum excitation of the vocal tract. Accurate, reliable detection of this “epoch” beginning (or end) is useful for pitch synchronous analysis/synthesis in a variety of contexts. Dynamic programming has been applied to correlation function peak selection [Secrest and Doddington, ICASSP‐83] and lagged waveform matching [Ney, IEEE Trans. SMC‐12 (1982)] for F0 determination with excellent results, but these techniques do not yield the epoch locations. The method described in this paper applies dynamic programming to select waveform maxima directly from a short‐time‐energy‐normalized LPC residual. The cumulative path costs are normalized by the path length. Local costs are based on peak amplitude and quality, transition costs on period and pulse similarity. The output is the set of pulse locations that globally satisfy the cost constraints over all voiced regions. Data to be presented indicate that these ...
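The general idea of choosing pulse locations by dynamic programming over candidate peaks can be sketched as follows. This is an illustration under stated assumptions, not the method of the paper: the function name `select_pulses`, the parameters `bias` and `w_trans`, and both cost formulas are hypothetical stand-ins. The local cost here uses peak amplitude only, and the transition cost uses period similarity only. Peak quality, pulse-shape similarity, and the path-length normalization mentioned above are left out. Candidate peaks are assumed to have been picked already from a normalized residual, with amplitudes scaled to [0, 1].

```python
import math


def select_pulses(times, amps, min_period, max_period, bias=0.5, w_trans=1.0):
    """Pick the chain of candidate peaks with minimum total cost.

    times: sorted sample indices of candidate peaks
    amps:  peak amplitudes scaled to [0, 1]
    A DP state (i, j) means "pulse i is followed directly by pulse j".
    All cost definitions are illustrative assumptions.
    """
    n = len(times)
    # Assumed local cost: peaks stronger than `bias` get a negative cost.
    local = [bias - a for a in amps]

    best = {}  # (i, j) -> (cumulative cost, previous state or None)
    for j in range(n):
        for i in range(j):
            period = times[j] - times[i]
            if not (min_period <= period <= max_period):
                continue
            # Either start a new chain at i, or extend the cheapest
            # chain that ends in a state (h, i).
            cost, prev = local[i], None
            for h in range(i):
                if (h, i) in best:
                    # Assumed transition cost: log-ratio of adjacent periods.
                    jump = abs(math.log(period / (times[i] - times[h])))
                    c = best[(h, i)][0] + w_trans * jump
                    if c < cost:
                        cost, prev = c, (h, i)
            best[(i, j)] = (cost + local[j], prev)

    if not best:
        return []

    # Backtrack from the cheapest final state.
    state = min(best, key=lambda s: best[s][0])
    chain = [state[1]]
    while state is not None:
        chain.append(state[0])
        state = best[state][1]
    return [times[k] for k in reversed(chain)]
```

For example, with candidates at samples 0, 50, 100, ..., 300 where strong peaks (amplitude near 1) alternate with weak ones (amplitude near 0.25) and periods are limited to 40 to 160 samples, the sketch returns only the strong peaks at 0, 100, 200, and 300. Making the state a pair of consecutive pulses is what lets the transition cost compare adjacent periods exactly, at the price of a larger state space.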