Tree-Structured Clustering Methods for Piecewise Linear-Transformation-Based Noise Adaptation
This paper proposes applying tree-structured clustering to noisy speech collected under various SNR conditions, within the framework of piecewise-linear transformation (PLT)-based HMM adaptation. Three clustering methods are described: a one-step method that integrates noise and SNR conditions, and two two-step methods that construct a tree for each SNR condition. From the clustering results, a noisy-speech HMM is built for each node of the tree structure. Under the likelihood-maximization criterion, the HMM that best matches the input speech is selected by tracing the tree from top to bottom, and the selected HMM is further adapted by linear transformation. The proposed methods are evaluated on a Japanese dialogue recognition system. The results confirm that the methods are effective for recognizing both speech with digitally added noise and actual noisy speech uttered by a wide range of speakers under various noise conditions. The results also indicate that the one-step clustering method performs better than the two-step clustering methods.
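The top-to-bottom selection step can be illustrated with a minimal sketch. It is not the paper's implementation: each node's noisy-speech HMM is replaced here by a single one-dimensional Gaussian as a stand-in, the names `Node`, `log_likelihood`, and `select_node` are hypothetical, and the rule of keeping the best-scoring node seen along the descent path is one plausible reading of "tracing the tree from top to bottom". The subsequent linear-transformation adaptation of the selected model is omitted.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One cluster in the tree. A 1-D Gaussian stands in for the node's HMM (assumption)."""
    name: str
    mean: float
    var: float
    children: List["Node"] = field(default_factory=list)


def log_likelihood(frames: List[float], node: Node) -> float:
    """Total Gaussian log-likelihood of the input frames under the node's model."""
    return sum(
        -0.5 * (math.log(2 * math.pi * node.var) + (x - node.mean) ** 2 / node.var)
        for x in frames
    )


def select_node(root: Node, frames: List[float]) -> Node:
    """Descend from the root, following the most likely child at each level,
    and return the highest-likelihood node seen on that path (assumed rule)."""
    best, best_score = root, log_likelihood(frames, root)
    current = root
    while current.children:
        current = max(current.children, key=lambda c: log_likelihood(frames, c))
        score = log_likelihood(frames, current)
        if score > best_score:
            best, best_score = current, score
    return best


# Toy tree: a broad root model, two mid-level clusters, and four leaf clusters.
root = Node("root", 0.0, 4.0, [
    Node("low_snr", -2.0, 1.0, [Node("low_a", -3.0, 0.5), Node("low_b", -1.5, 0.5)]),
    Node("high_snr", 2.0, 1.0, [Node("high_a", 1.5, 0.5), Node("high_b", 3.0, 0.5)]),
])

frames = [2.9, 3.1, 3.0]
selected = select_node(root, frames)
print(selected.name)
```

In this toy setup, input frames close to a leaf's mean select that leaf, while frames that no specific cluster matches well fall back to a broader node higher in the tree.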