Recognition of Noisy Speech: A Comparative Survey of Robust Model Architecture and Feature Enhancement
Björn W. Schuller | Martin Wöllmer | Gerhard Rigoll | Tobias Moosmayr
[1] Wu Chou,et al. Minimum classification error linear regression for acoustic model adaptation of continuous density HMMs , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).
[2] Hermann Ney,et al. Quantile based histogram equalization for noise robust large vocabulary speech recognition , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[3] James R. Glass,et al. Noise Robust Phonetic Classification with Linear Regularized Least Squares and Second-Order Features , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[4] B. Cranen,et al. Noise reduction through compressed sensing , 2008, INTERSPEECH.
[5] Tanja Schultz,et al. Comparison of acoustic model adaptation techniques on non-native speech , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).
[6] Björn W. Schuller,et al. Speech recognition in noisy environments using a switching linear dynamic model for feature enhancement , 2008, INTERSPEECH.
[7] Nam Soo Kim. Nonstationary environment compensation based on sequential estimation , 1998, IEEE Signal Processing Letters.
[8] Lee-Sup Kim,et al. An advanced contrast enhancement using partially overlapped sub-block histogram equalization , 2001, IEEE Trans. Circuits Syst. Video Technol.
[9] Denis Jouvet,et al. Evaluation of a noise-robust DSR front-end on Aurora databases , 2002, INTERSPEECH.
[10] David Barber,et al. Expectation Correction for Smoothed Inference in Switching Linear Dynamical Systems , 2006, J. Mach. Learn. Res.
[11] A. Stolcke,et al. Noise-resistant feature extraction and model training for robust speech recognition , 1996 .
[12] Tet Hin Yeap,et al. Noisy Speech Feature Estimation on the Aurora2 Database using a Switching Linear Dynamic Model , 2007, J. Multim.
[13] Abeer Alwan,et al. HMM-based estimation of unreliable spectral components for noise robust speech recognition , 2008, INTERSPEECH.
[14] José L. Pérez-Córdoba,et al. Histogram equalization of speech representation for robust speech recognition , 2005, IEEE Transactions on Speech and Audio Processing.
[15] Joseph Picone,et al. Applications of support vector machines to speech recognition , 2004, IEEE Transactions on Signal Processing.
[16] Jeff A. Bilmes,et al. Maximum mutual information based reduction strategies for cross-correlation based joint distributional modeling , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[17] C. Striebel,et al. On the maximum likelihood estimates for linear dynamic systems , 1965 .
[18] Björn W. Schuller,et al. Static and Dynamic Modelling for the Recognition of Non-verbal Vocalisations in Conversational Speech , 2008, PIT.
[19] David Barber,et al. Switching Linear Dynamical Systems for Noise Robust Speech Recognition , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[20] David Pearce,et al. The aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions , 2000, INTERSPEECH.
[21] Chin-Hui Lee,et al. Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains , 1994, IEEE Trans. Speech Audio Process.
[22] Björn W. Schuller,et al. Hidden Conditional Random Fields for Meeting Segmentation , 2007, 2007 IEEE International Conference on Multimedia and Expo.
[23] Steven Greenberg,et al. Robust speech recognition using the modulation spectrogram , 1998, Speech Commun.
[24] Hong Kook Kim,et al. Cepstrum-domain acoustic feature compensation based on decomposition of speech and noise for ASR in noisy environments , 2001, IEEE Trans. Speech Audio Process.
[25] Rhee Man Kil,et al. Auditory processing of speech signals for robust speech recognition in real-world noisy environments , 1999, IEEE Trans. Speech Audio Process.
[26] Saeed Vaseghi,et al. Noise compensation methods for hidden Markov model speech recognition in adverse environments , 1997, IEEE Trans. Speech Audio Process.
[27] Trevor Darrell,et al. Conditional Random Fields for Object Recognition , 2004, NIPS.
[28] Richard Lippmann,et al. A comparison of signal processing front ends for automatic word recognition , 1995, IEEE Trans. Speech Audio Process.
[29] Jean Paul Haton,et al. Compensation of noise effects for robust speech recognition in car environments , 2000, INTERSPEECH.
[30] Rainer Martin,et al. Speech enhancement in the DFT domain using Laplacian speech priors , 2003 .
[31] G. R. Doddington,et al. Computers: Speech recognition: Turning theory to practice: New ICs have brought the requisite computer power to speech technology; an evaluation of equipment shows where it stands today , 1981, IEEE Spectrum.
[32] Saeed Vaseghi,et al. Speech recognition in noisy environments , 1992, ICSLP.
[33] Björn W. Schuller,et al. Towards More Reality in the Recognition of Emotional Speech , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[34] Misha Pavel,et al. On the relative importance of various components of the modulation spectrum for automatic speech recognition , 1999, Speech Commun.
[35] Jürgen Schmidhuber,et al. An Application of Recurrent Neural Networks to Discriminative Keyword Spotting , 2007, ICANN.
[36] Odette Scharenborg,et al. The interspeech 2008 consonant challenge , 2008, INTERSPEECH.
[37] Martin Bouchard,et al. Comb filter decomposition for robust ASR , 2005, INTERSPEECH.
[38] Hakan Erdogan,et al. Incremental on-line feature space MLLR adaptation for telephony speech recognition , 2002, INTERSPEECH.
[39] D. Rubin,et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .
[40] Li Deng,et al. Recursive estimation of nonstationary noise using iterative stochastic approximation for robust speech recognition , 2003, IEEE Trans. Speech Audio Process.
[41] Sarel van Vuuren,et al. Relevance of time-frequency features for phonetic and speaker-channel classification , 2000, Speech Commun.
[42] Lawrence R. Rabiner,et al. A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.
[43] Detlev Langmann,et al. Acoustic front ends for speaker-independent digit recognition in car environments , 1997, EUROSPEECH.
[44] John N. Tsitsiklis,et al. Introduction to Probability , 2002 .
[45] Fernando Pereira,et al. Shallow Parsing with Conditional Random Fields , 2003, NAACL.
[46] Yaakov Bar-Shalom,et al. Estimation and Tracking: Principles, Techniques, and Software , 1993 .
[47] H. Bourlard,et al. Unsupervised spectral subtraction for noise-robust ASR , 2005, IEEE Workshop on Automatic Speech Recognition and Understanding, 2005.
[48] Charles M. Grinstead,et al. Introduction to probability , 1999, Statistics for the Behavioural Sciences.
[49] Alex Acero,et al. Hidden conditional random fields for phone classification , 2005, INTERSPEECH.
[50] Ephraim. Speech enhancement using a minimum mean square error short-time spectral amplitude estimator , 1984 .
[51] Alex Acero,et al. Noise robust speech recognition with a switching linear dynamic model , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[52] Guillaume Lathoud. Channel Normalization for Unsupervised Spectral Subtraction , 2006 .
[53] Hynek Hermansky. TRAP-TANDEM: data-driven extraction of temporal features from speech , 2003, 2003 IEEE Workshop on Automatic Speech Recognition and Understanding (IEEE Cat. No.03EX721).
[54] H. Hermansky,et al. Perceptual linear predictive (PLP) analysis of speech , 1990, The Journal of the Acoustical Society of America.
[55] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[56] Trevor Darrell,et al. Hidden Conditional Random Fields for Gesture Recognition , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).
[57] Reinhold Häb-Umbach,et al. Modeling the dynamics of speech and noise for speech feature enhancement in ASR , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.
[58] Hynek Hermansky,et al. RASTA-PLP speech analysis technique , 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[59] Brian Roark,et al. Discriminative Language Modeling with Conditional Random Fields and the Perceptron Algorithm , 2004, ACL.
[60] M.G. Rahim,et al. Signal conditioning techniques for robust speech recognition , 1996, IEEE Signal Processing Letters.
[61] Rahul Sarpeshkar,et al. An FFT-Based Companding Front End for Noise-Robust Automatic Speech Recognition , 2007, EURASIP J. Audio Speech Music. Process.
[62] Olli Viikki,et al. Cepstral domain segmental feature vector normalization for noise robust speech recognition , 1998, Speech Commun.
[63] Björn W. Schuller,et al. On the Necessity and Feasibility of Detecting a Driver's Emotional State While Driving , 2007, ACII.
[64] Hynek Hermansky,et al. Evaluation and optimization of perceptually-based ASR front-end , 1993, IEEE Trans. Speech Audio Process.
[65] Richard M. Stern,et al. A vector Taylor series approach for environment-independent speech recognition , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[66] Abeer Alwan,et al. On the use of variable frame rate analysis in speech recognition , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).
[67] Peter Jancovic,et al. On the mask modeling and feature representation in the missing-feature ASR: evaluation on the Consonant Challenge , 2008, INTERSPEECH.
[68] Li Deng,et al. A comparison of three non-linear observation models for noisy speech features , 2003, INTERSPEECH.
[69] Richard M. Stern,et al. Environmental robustness in automatic speech recognition , 1990, International Conference on Acoustics, Speech, and Signal Processing.
[70] Andrew McCallum,et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , 2001, ICML.
[71] W. Bruce Croft,et al. Table extraction using conditional random fields , 2003, DG.O.
[72] William J. J. Roberts,et al. Revisiting autoregressive hidden Markov modeling of speech signals , 2005, IEEE Signal Processing Letters.
[73] Adrião Duarte Dória Neto,et al. Digit recognition using wavelet and SVM in Brazilian Portuguese , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.