Paper Information - A Minimax Classification Approach With Application To Robust Speech Recognition

A Minimax Classification Approach With Application To Robust Speech Recognition

A minimax approach for robust classification of parametric information sources is studied and applied to isolated-word speech recognition based on hidden Markov modeling. The goal is to reduce the sensitivity of speech recognition systems to a possible mismatch between the training and testing conditions. To this end, a generalized likelihood ratio test is developed and shown to be optimal in the sense of achieving the highest asymptotic exponential rate of decay of the error probability for the worst-case mismatch situation. The proposed approach is compared to the standard approach, where no mismatch is assumed, in recognition of noisy speech and in other realistic mismatch situations.
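The decision rule described in the abstract can be illustrated with a toy sketch. It is not the paper's hidden Markov model implementation: it assumes scalar i.i.d. Gaussian sources with known standard deviation, and it models mismatch as an interval of allowed means around each trained (nominal) mean. The function name `minimax_glrt_classify` and the parameters `nominal_means`, `eps`, and `sigma` are hypothetical, chosen only for illustration. Each class is scored by its likelihood maximized over its mismatch neighborhood, and the class with the highest score wins.

```python
import numpy as np

def minimax_glrt_classify(x, nominal_means, eps, sigma=1.0):
    """Return the index of the class whose mismatch neighborhood best explains x.

    Toy assumption: class i emits i.i.d. Gaussian samples with known standard
    deviation `sigma` and an unknown mean within eps[i] of nominal_means[i].
    """
    x = np.asarray(x, dtype=float)
    sample_mean = x.mean()
    scores = []
    for mu, e in zip(nominal_means, eps):
        # Over the interval [mu - e, mu + e] the Gaussian likelihood is
        # maximized at the interval point closest to the sample mean.
        theta = np.clip(sample_mean, mu - e, mu + e)
        # Log-likelihood up to an additive constant shared by all classes.
        scores.append(-np.sum((x - theta) ** 2) / (2.0 * sigma ** 2))
    return int(np.argmax(scores))

observations = [1.2, 1.4, 1.3]
# With eps = 0 for every class the rule reduces to the standard
# (no-mismatch) maximum-likelihood classifier.
standard = minimax_glrt_classify(observations, [0.0, 3.0], eps=[0.0, 0.0])
# Allowing class 1 a wide mismatch neighborhood changes the decision.
robust = minimax_glrt_classify(observations, [0.0, 3.0], eps=[0.0, 2.0])
```

Setting every neighborhood width to zero recovers the standard approach that the abstract uses as its baseline, so the two can be compared on the same data.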

Chin-Hui Lee | Neri Merhav

[1] Dirk Van Compernolle. Noise adaptation in a hidden Markov model speech recognition system , 1989 .

[2] L. R. Rabiner,et al. Some properties of continuous hidden Markov model representations , 1985, AT&T Technical Journal.

[3] Oded Ghitza,et al. Auditory nerve representation as a front-end for speech recognition in a noisy environment , 1986 .

[4] N. Merhav,et al. Hidden Markov modeling using a dominant state sequence with application to speech recognition , 1991 .

[5] D. Van Compernolle. Increased noise immunity in large vocabulary speech recognition with the aid of spectral subtraction , 1987, ICASSP.

[6] B. Atal. Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification. , 1974, The Journal of the Acoustical Society of America.

[7] J. Makhoul,et al. Linear prediction: A tutorial review , 1975, Proceedings of the IEEE.

[8] Michael Picheny,et al. Speech recognition using noise-adaptive prototypes , 1989, IEEE Trans. Acoust. Speech Signal Process..

[9] Biing-Hwang Juang,et al. Recent developments in speech recognition under adverse conditions , 1990, ICSLP.

[10] L. Baum,et al. A Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chains , 1970 .

[11] Biing-Hwang Juang,et al. On the application of hidden Markov models for enhancing noisy speech , 1988, ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing.

[12] Chin-Hui Lee,et al. On the asymptotic statistical behavior of empirical cepstral coefficients , 1993, IEEE Trans. Signal Process..

[13] Neri Merhav,et al. A Bayesian classification approach with application to speech recognition , 1991, IEEE Trans. Signal Process..

[14] Biing-Hwang Juang,et al. The short-time modified coherence representation and noisy speech recognition , 1989, IEEE Trans. Acoust. Speech Signal Process..

[15] John H. L. Hansen,et al. Constrained iterative speech enhancement with application to automatic speech recognition , 1988, ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing.

[16] Yeunung Chen,et al. Cepstral domain stress compensation for robust speech recognition , 1987, ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[17] L. Baum,et al. An inequality and associated maximization technique in statistical estimation of probabilistic functions of a Markov process , 1972 .

[18] Yariv Ephraim. Gain-adapted hidden Markov models for recognition of clean and noisy speech , 1992, IEEE Trans. Signal Process..

[19] A. Erell,et al. Estimation using log-spectral-distance criterion for noise-robust speech recognition , 1990, International Conference on Acoustics, Speech, and Signal Processing.

[20] M. Hunt,et al. Speaker dependent and independent speech recognition experiments with an auditory model , 1988, ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing.

[21] W. Hoeffding. Asymptotically Optimal Tests for Multinomial Distributions , 1965 .

[22] P. J. Huber. Robust Statistical Procedures , 1977 .

[23] Lawrence R. Rabiner,et al. Some performance benchmarks for isolated word speech recognition systems , 1987 .

[24] A. Nadas,et al. Adaptive labeling: normalization of speech by adaptive transformations based on vector quantization , 1988, ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing.

[25] Biing-Hwang Juang,et al. A family of distortion measures based upon projection operation for robust speech recognition , 1989, IEEE Trans. Acoust. Speech Signal Process..

[26] Lawrence R. Rabiner,et al. A minimum discrimination information approach for hidden Markov modeling , 1989, IEEE Trans. Inf. Theory.

[27] Harry L. Van Trees,et al. Detection, Estimation, and Modulation Theory, Part I , 1968 .

[28] William J. Byrne,et al. The Auditory Processing and Recognition of Speech , 1989, HLT.

[29] Richard M. Stern,et al. Environmental robustness in automatic speech recognition , 1990, International Conference on Acoustics, Speech, and Signal Processing.

[30] J. Makhoul,et al. On the statistics of the estimated reflection coefficients of an autoregressive process , 1983 .

[31] D. B. Roe. Speech recognition with a noise-adapting codebook , 1987, ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[32] Biing-Hwang Juang,et al. On the use of bandpass liftering in speech recognition , 1987, IEEE Trans. Acoust. Speech Signal Process..

[33] Lawrence R. Rabiner,et al. A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.

[34] H. Matsumoto,et al. Comparative study of various spectrum matching measures on noise robustness , 1986, ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[35] D. Mansour,et al. The short-time modified coherence representation and its application for noisy speech recognition , 1988, ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing.

[36] John H. L. Hansen,et al. Stress compensation and noise reduction algorithms for robust speech recognition , 1989, International Conference on Acoustics, Speech, and Signal Processing.

[37] I. Csiszár. Why least squares and maximum entropy? An axiomatic approach to inference for linear inverse problems , 1991 .

[38] Neri Merhav,et al. On the estimation of the order of a Markov chain and universal data compression , 1989, IEEE Trans. Inf. Theory.

[39] D. A. Preece,et al. An introduction to the statistical analysis of data , 1979 .

[40] Biing-Hwang Juang,et al. Signal restoration by spectral mapping , 1987, ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[41] Chin-Hui Lee,et al. Speech recognition under additive noise , 1984, ICASSP.

[42] B.S. Atal,et al. Automatic recognition of speakers from their voices , 1976, Proceedings of the IEEE.

[43] Steven F. Boll,et al. Optimal estimators for spectral restoration of noisy speech , 1984, ICASSP.

[44] IEEE Transactions on Speech and Audio Processing , 2022 .

[45] I. Csiszár. Why least squares and maximum entropy? An axiomatic approach to inverse problems , 1990 .

[46] Man Mohan Sondhi,et al. A frequency-weighted Itakura spectral distortion measure and its application to speech recognition in noise , 1987, ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[47] Oded Ghitza. Robustness against noise: The role of timing-synchrony measurement , 1987, ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[48] Byoung-Seon Choi,et al. Conditional limit theorems under Markov conditioning , 1987, IEEE Trans. Inf. Theory.

[49] Brian Hanson,et al. Robust speaker-independent word recognition using static, dynamic and acceleration features: experiments with Lombard and noisy speech , 1990, International Conference on Acoustics, Speech, and Signal Processing.

[50] Frederick R. Forst,et al. On robust estimation of the location parameter , 1980 .

[51] Stefan Dobler,et al. Real-time connected-word recognition in a noisy environment , 1989, International Conference on Acoustics, Speech, and Signal Processing.

[52] D. van Compernolle. Spectral estimation using a log-distance error criterion applied to speech recognition , 1989, ICASSP.

[53] R. Ellis,et al. Large deviations and statistical mechanics , 1985 .

[54] Rodney W. Johnson,et al. Axiomatic derivation of the principle of maximum entropy and the principle of minimum cross-entropy , 1980, IEEE Trans. Inf. Theory.

[55] D. B. Paul. A speaker-stress resistant HMM isolated word recognizer , 1987, ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[56] E. A. Martin,et al. Multi-style training for robust isolated-word speech recognition , 1987, ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[57] Roger K. Moore,et al. Noise compensation algorithms for use with hidden Markov model based speech recognition , 1988, ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing.

[58] H. Gish,et al. Probabilistic vector mapping of noisy speech parameters for HMM word spotting , 1990, International Conference on Acoustics, Speech, and Signal Processing.

[59] S. Natarajan. Large deviations, hypotheses testing, and source coding for finite Markov chains , 1985, IEEE Trans. Inf. Theory.

[60] Michael Gutman,et al. Asymptotically optimal classification for multiple tests with empirically observed statistics , 1989, IEEE Trans. Inf. Theory.

[61] R. Ellis,et al. Entropy, large deviations, and statistical mechanics , 1985 .

[62] A. Kester,et al. Large Deviations of Estimators , 1986 .

[63] Neri Merhav,et al. The estimation of the model order in exponential families , 1989, IEEE Trans. Inf. Theory.

[64] P. J. Huber. Robust Estimation of a Location Parameter , 1964 .

[65] N. Sedgwick,et al. Noise compensation for speech recognition using probabilistic models , 1986, ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[66] Donald B. Rubin,et al. Maximum Likelihood from Incomplete Data , 1972 .

[67] R. R. Bahadur. Rates of Convergence of Estimates and Test Statistics , 1967 .

[68] Jacob Ziv,et al. On classification with empirically observed statistics and universal data compression , 1988, IEEE Trans. Inf. Theory.

[69] Yeunung Chen,et al. Cepstral domain talker stress compensation for robust speech recognition , 1988, IEEE Trans. Acoust. Speech Signal Process..

[70] Biing-Hwang Juang,et al. The segmental K-means algorithm for estimating parameters of hidden Markov models , 1990, IEEE Trans. Acoust. Speech Signal Process..

[71] Brian A. Hanson,et al. Spectral slope distance measures with linear prediction analysis for word recognition in noise , 1987, IEEE Trans. Acoust. Speech Signal Process..

[72] Yariv Ephraim,et al. A linear predictive front-end processor for speech recognition in noisy environments , 1987, ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[73] S.A. Kassam,et al. Robust techniques for signal processing: A survey , 1985, Proceedings of the IEEE.

[74] P. J. Huber. A Robust Version of the Probability Ratio Test , 1965 .

[75] Clifford J. Weinstein,et al. Experiments in isolated word recognition using noisy speech , 1983, ICASSP.