Voiced-unvoiced-silence classification of speech signals based on statistical approaches

Abstract: This paper describes a pattern recognition approach for deciding whether a given segment of a speech signal should be classified as voiced speech, unvoiced speech, or silence. Five different measurements are made on the speech segment to be classified. The segment is then assigned to a class by a minimum-distance rule, derived under the assumption that the measured parameters follow a multidimensional Gaussian probability density function. The means and covariances of the Gaussian distribution for each class are determined using two statistical approaches. The method has been found to provide reliable classification with speech segments as short as 10 ms.
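
As an illustration of the kind of decision rule the abstract refers to, the following is a minimal sketch of a minimum-distance classifier under a Gaussian assumption. It is not the paper's implementation. The exact form of the distance is an assumption: the sketch uses the quadratic (Mahalanobis) term plus an optional log-determinant term and assumes equal class priors. The function names, the two-dimensional toy feature vectors, and all numeric values are hypothetical; the paper itself uses five measurements, which are not specified in the abstract.

```python
import math

def cholesky(a):
    """Return lower-triangular L with L * L^T == a (a must be symmetric positive definite)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("covariance is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L

def gaussian_distance(x, mean, cov, include_log_det=True):
    """Distance of x from a class with the given mean and covariance.

    Assumed form: (x - m)^T W^{-1} (x - m), plus ln|W| when include_log_det is True.
    Equal class priors are assumed, so no prior term appears.
    """
    L = cholesky(cov)
    diff = [xi - mi for xi, mi in zip(x, mean)]
    # Forward substitution solves L y = diff; then y . y equals the quadratic form.
    y = []
    for i in range(len(diff)):
        y.append((diff[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    d = sum(v * v for v in y)
    if include_log_det:
        # ln|W| = 2 * sum(ln L_ii) for the Cholesky factor L.
        d += 2.0 * sum(math.log(L[i][i]) for i in range(len(L)))
    return d

def classify(x, class_stats):
    """Assign x to the class whose distance is smallest.

    class_stats maps a class name to a (mean, covariance) pair.
    """
    return min(class_stats, key=lambda name: gaussian_distance(x, *class_stats[name]))

# Hypothetical two-feature statistics, for illustration only.
TOY_STATS = {
    "silence":  ([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]),
    "unvoiced": ([5.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]),
    "voiced":   ([0.0, 5.0], [[2.0, 0.5], [0.5, 1.0]]),
}

if __name__ == "__main__":
    for segment_features in ([0.2, -0.1], [4.8, 0.3], [0.5, 4.6]):
        print(segment_features, "->", classify(segment_features, TOY_STATS))
```

In practice the class means and covariances would be estimated from labeled training segments, and each 10 ms segment would contribute one feature vector to `classify`. The Cholesky factorization is used here only to avoid an explicit matrix inverse; any equivalent way of evaluating the quadratic form would serve.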