A Segmental HMM for Speech Waveforms
We present a purely time-domain approach to speech processing which identifies waveform samples at the boundaries between glottal pulse periods (in voiced speech) or at the boundaries between unvoiced segments. An efficient algorithm for inferring these boundaries is derived from a simple probabilistic generative model of speech, and state-of-the-art results are presented on pitch tracking, voiced/unvoiced detection, and timescale modification.
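To make the idea of inferring segment boundaries directly on waveform samples concrete, here is a minimal toy sketch. It is not the model or algorithm of the paper, whose details the abstract does not give. It assumes a dynamic program over (boundary position, segment length) states and uses a squared-error mismatch between neighbouring segments as a stand-in for the negative log-likelihood of a generative model. The function name `segment_boundaries` and its parameters `min_len` and `max_len` are hypothetical.

```python
import numpy as np

def segment_boundaries(x, min_len, max_len):
    """Toy segmental Viterbi (illustrative assumption, not the paper's model).

    Splits x into consecutive segments with lengths in [min_len, max_len]
    so that each segment resembles the segment before it.
    Returns the list of boundary sample indices, from 0 to len(x).
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    lengths = range(min_len, max_len + 1)
    # cost[t, L]: best cost of segmenting x[:t] when the last segment has length L
    cost = np.full((n + 1, max_len + 1), np.inf)
    # back[t, L]: length of the segment preceding the last one (0 if none)
    back = np.zeros((n + 1, max_len + 1), dtype=int)

    # The first segment has no predecessor to compare against.
    for L in lengths:
        if L <= n:
            cost[L, L] = 0.0

    for t in range(min_len, n + 1):
        for L in lengths:
            s = t - L  # start of the current segment
            if s < min_len:
                continue  # no room for a previous segment
            for Lp in lengths:
                if s - Lp < 0 or not np.isfinite(cost[s, Lp]):
                    continue
                m = min(L, Lp)
                # Mismatch between the aligned starts of the two segments.
                d = x[s:s + m] - x[s - Lp:s - Lp + m]
                c = cost[s, Lp] + float(d @ d)
                if c < cost[t, L]:
                    cost[t, L] = c
                    back[t, L] = Lp

    L = int(np.argmin(cost[n]))
    if not np.isfinite(cost[n, L]):
        raise ValueError("no segmentation covers the signal with these lengths")

    # Walk the back-pointers from the end of the signal to sample 0.
    bounds = [n]
    t = n
    while t > 0:
        Lp = int(back[t, L])
        t -= L
        bounds.append(t)
        L = Lp
    return bounds[::-1]

# Synthetic "voiced" signal: six exact repetitions of a 20-sample pulse shape.
k = np.arange(20)
pulse = np.sin(2 * np.pi * k / 20) + 0.5 * np.sin(4 * np.pi * k / 20 + 1.0)
signal = np.tile(pulse, 6)
bounds = segment_boundaries(signal, min_len=12, max_len=30)
```

Under these assumptions, the spacing between consecutive boundaries is a period estimate in samples, so at a sampling rate `fs` a segment of length `L` corresponds to a pitch of `fs / L`. The allowed length range is kept below twice the true period here because this toy cost alone cannot tell a period from its multiples.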