A Bayesian Predictive Method for Automatic Speech Segmentation

Implicit speech segmentation amounts to finding the time instants at which the spectral distortion is large. The spectral variation function (SVF) is a widely used measure of spectral distortion; however, it is data-dependent. To make the measurement data-independent, a likelihood ratio is constructed to measure the spectral distortion. This ratio can be computed efficiently with a Bayesian predictive model. The prior of the Bayesian predictive model is estimated from unlabeled data with an unsupervised machine learning technique, the Gaussian mixture model (GMM). Experimental results show the effectiveness of this novel method. Its performance on the TIMIT corpus indicates potential applications in speech recognition, synthesis, and coding.
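
To illustrate the baseline idea of finding time instants with large spectral distortion, the following is a minimal sketch of one common SVF-style score. It is not the Bayesian predictive method of this paper. The cosine-based definition, the function names `spectral_variation` and `pick_boundaries`, the `half_window` parameter, and the threshold value are all assumptions made for illustration.

```python
import numpy as np


def spectral_variation(features, half_window=2):
    """Assumed SVF variant: 1 - cosine similarity between the mean
    feature vectors of the frames just before and just after frame t."""
    features = np.asarray(features, dtype=float)
    n_frames = features.shape[0]
    svf = np.zeros(n_frames)
    for t in range(half_window, n_frames - half_window):
        left = features[t - half_window:t].mean(axis=0)
        right = features[t:t + half_window].mean(axis=0)
        denom = np.linalg.norm(left) * np.linalg.norm(right)
        if denom > 0:
            svf[t] = 1.0 - np.dot(left, right) / denom
    return svf


def pick_boundaries(score, threshold):
    """Return frame indices that are local maxima of the score above a
    hand-chosen threshold (hypothetical peak-picking rule)."""
    return [
        t for t in range(1, len(score) - 1)
        if score[t] > threshold
        and score[t] >= score[t - 1]
        and score[t] > score[t + 1]
    ]


# Toy input: 10 frames of one "spectrum" followed by 10 frames of another.
feats = np.vstack([np.tile([1.0, 0.0], (10, 1)),
                   np.tile([0.0, 1.0], (10, 1))])
svf = spectral_variation(feats, half_window=2)
print(pick_boundaries(svf, threshold=0.5))
```

In this sketch the threshold is chosen by hand for the toy input and would have to be retuned for other data, which is one way a score of this kind can depend on the data it is applied to.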