Paper Information - A Segmental HMM for Speech Waveforms

A Segmental HMM for Speech Waveforms

We present a purely time-domain approach to speech processing which identifies waveform samples at the boundaries between glottal pulse periods (in voiced speech) or at the boundaries between unvoiced segments. An efficient algorithm for inferring these boundaries is derived from a simple probabilistic generative model of speech, and state-of-the-art results are presented on pitch tracking, voiced/unvoiced detection and timescale modification.

[1] A. Wilgus, et al. High quality time-scale modification for speech, 1985, ICASSP '85. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[2] Fabrice Plante, et al. A pitch extraction reference database, 1995, EUROSPEECH.

[3] Radford M. Neal, et al. Inferring State Sequences for Non-linear Systems with Embedded Hidden Markov Models, 2003, NIPS.

[4] B. Frey, et al. Transformation-Invariant Clustering Using the EM Algorithm, 2003, IEEE Trans. Pattern Anal. Mach. Intell.