A low complexity model adaptation approach involving sparse coding over multiple dictionaries

This paper describes a novel on-line adaptation approach for scenarios in which extremely little adaptation data is available. The proposed approach extends a similar redundant-dictionary-based approach recently reported in the literature. In this work, the orthogonal matching pursuit (OMP) algorithm is used for bases selection instead of matching pursuit (MP), which avoids selecting an atom more than once. Furthermore, unlike the conventional eigenvoice technique, this work explores the use of cluster-specific eigenvoices to capture local acoustic details. These approaches are then combined to reduce the number of weight parameters that must be estimated to derive the adapted model. For this purpose, the test data is sparse coded separately over each of a set of dictionaries. The resulting sparse-coded supervectors are then scaled and used as the Gaussian mean parameters of the adapted model. Consequently, only a few scaling factors need to be estimated. Such a reduction in the number of parameters is highly desirable for on-line applications where latency is a major factor.
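
The two ideas above, OMP-based atom selection and a small set of scaling factors applied to per-dictionary sparse codes, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names `omp` and `adapt_mean`, the target vector `y` (standing in for whatever representation of the test data is sparse coded), and the use of ordinary least squares to fit the scaling factors are all assumptions made for illustration; the abstract does not say how the scaling factors are estimated.

```python
import numpy as np


def omp(D, y, n_atoms):
    """Orthogonal matching pursuit over dictionary D (columns are atoms).

    After each greedy pick, all selected coefficients are refit jointly by
    least squares, so the residual is orthogonal to every chosen atom and
    no atom is selected twice (unlike plain matching pursuit).
    """
    residual = y.astype(float).copy()
    support = []
    coef = np.zeros(0)
    for _ in range(n_atoms):
        corr = np.abs(D.T @ residual)
        if support:
            corr[support] = 0.0  # explicit guard against re-selection
        support.append(int(np.argmax(corr)))
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x


def adapt_mean(dictionaries, y, n_atoms):
    """Sparse code y over each dictionary, then combine with one scale each.

    ASSUMPTION: the scaling factors are fit by least squares here purely for
    illustration; only len(dictionaries) parameters are estimated.
    """
    coded = [D @ omp(D, y, n_atoms) for D in dictionaries]
    S = np.column_stack(coded)  # one sparse-coded supervector per column
    scales, *_ = np.linalg.lstsq(S, y, rcond=None)
    return S @ scales, scales


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Three hypothetical dictionaries of 40 unit-norm atoms in 20 dimensions.
    dicts = []
    for _ in range(3):
        D = rng.standard_normal((20, 40))
        dicts.append(D / np.linalg.norm(D, axis=0))
    y = rng.standard_normal(20)
    mean, scales = adapt_mean(dicts, y, n_atoms=4)
    print(scales.shape)  # one scaling factor per dictionary
```

In this sketch the number of free parameters in the final combination equals the number of dictionaries, regardless of how many atoms each dictionary contains, which is the kind of parameter reduction the abstract describes.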
