Noise robust ASR: Missing data techniques and beyond

[1]  Hugo Van hamme.  Robust speech recognition using missing feature theory in the cepstral or LDA domain , 2003, INTERSPEECH.

[2]  Jon Barker,et al.  Soft decisions in missing data techniques for robust automatic speech recognition , 2000, INTERSPEECH.

[3]  M. Elad,et al.  K-SVD: An Algorithm for Designing Overcomplete Dictionaries for Sparse Representation , 2006, IEEE Transactions on Signal Processing.

[4]  Daniel P. W. Ellis,et al.  Decoding speech in the presence of other sources , 2005, Speech Commun..

[5]  Balas K. Natarajan,et al.  Sparse Approximate Solutions to Linear Systems , 1995, SIAM J. Comput..

[6]  Phil D. Green,et al.  Missing data theory, spectral subtraction and signal-to-noise estimation for robust ASR: an integrated study , 1999, EUROSPEECH.

[7]  Stephen P. Boyd,et al.  An Interior-Point Method for Large-Scale $\ell_1$-Regularized Least Squares , 2007, IEEE Journal of Selected Topics in Signal Processing.

[8]  Hugo Van hamme,et al.  Handling convolutional noise in missing data automatic speech recognition , 2006, INTERSPEECH.

[9]  Hugo Van hamme,et al.  Vector-quantization based mask estimation for missing data automatic speech recognition , 2007, INTERSPEECH.

[10]  Bhiksha Raj,et al.  Sparse Overcomplete Decomposition for Single Channel Speaker Separation , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[11]  Jon Barker,et al.  An automatic speech recognition system based on the scene analysis account of auditory perception , 2007, Speech Commun..

[12]  Emmanuel J. Candès,et al.  Decoding by linear programming , 2005, IEEE Transactions on Information Theory.

[13]  E. Candès,et al.  Stable signal recovery from incomplete and inaccurate measurements , 2005, math/0503066.

[14]  H. Van hamme,et al.  Robust speech recognition using cepstral domain missing data techniques and noisy masks , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[15]  Hugo Van hamme,et al.  PROSPECT features and their application to missing data techniques for vocal tract length normalization , 2005, INTERSPEECH.

[16]  Paul Dalsgaard,et al.  Exploiting Temporal Correlation of Speech for Error Robust and Bandwidth Flexible Distributed Speech Recognition , 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[17]  Jeff A. Bilmes,et al.  MVA Processing of Speech Features , 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[18]  Mike E. Davies,et al.  Compressed Sensing and Source Separation , 2007, ICA.

[19]  Richard O. Duda,et al.  Pattern classification and scene analysis , 1974, A Wiley-Interscience publication.

[20]  Jean Paul Haton,et al.  Accurate marginalization range for missing data recognition , 2007, INTERSPEECH.

[21]  Richard M. Stern,et al.  Reconstruction of incomplete spectrograms for robust speech recognition , 2000 .

[22]  Krzysztof Marasek,et al.  SPEECON – Speech Databases for Consumer Devices: Database Specification and Validation , 2002, LREC.

[23]  Herman J. M. Steeneken,et al.  Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems , 1993, Speech Commun..

[24]  Patrick Wambacq,et al.  Template-Based Continuous Speech Recognition , 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[25]  Corinna Cortes,et al.  Support-Vector Networks , 1995, Machine Learning.

[26]  Andrew Varga,et al.  Assessment for automatic speech recognition II , 1993 .

[27]  Vesa Siivola,et al.  Growing an n-gram language model , 2005, INTERSPEECH.

[28]  Ning Ma,et al.  Exploiting correlogram structure for robust speech recognition with multiple speech sources , 2007, Speech Commun..

[29]  Paris Smaragdis,et al.  Convolutive Speech Bases and Their Application to Supervised Speech Separation , 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[30]  Mikkel N. Schmidt,et al.  Shift Invariant Sparse Coding of Image and Music Data , 2007 .

[31]  Mikko Kurimo,et al.  Unlimited vocabulary speech recognition with morph language models applied to Finnish , 2006, Comput. Speech Lang..

[32]  Bert Cranen,et al.  Using sparse representations for missing data imputation in noise robust speech recognition , 2008, 2008 16th European Signal Processing Conference.

[33]  Phil D. Green,et al.  State based imputation of missing data for robust speech recognition and speech enhancement , 1999, EUROSPEECH.

[34]  Hugo Van hamme,et al.  Robust speech recognition using missing data techniques in the prospect domain and fuzzy masks , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.

[35]  Li Deng,et al.  HMM adaptation using vector taylor series for noisy speech recognition , 2000, INTERSPEECH.

[36]  J. Gemmeke,et al.  Detecting irregular orbits in gravitational N-body simulations , 2006, astro-ph/0607343.

[37]  Rainer Martin,et al.  Noise power spectral density estimation based on optimal smoothing and minimum statistics , 2001, IEEE Trans. Speech Audio Process..

[38]  Daniel P. W. Ellis,et al.  Feature extraction using non-linear transformation for robust speech recognition on the Aurora database , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).

[39]  Gaël Richard,et al.  The speechdat-car multilingual speech databases for in-car applications: some first validation results , 1999, EUROSPEECH.

[40]  Mark J. F. Gales,et al.  Robust continuous speech recognition using parallel model combination , 1996, IEEE Trans. Speech Audio Process..

[41]  Phil D. Green,et al.  Robust automatic speech recognition with missing and unreliable acoustic data , 2001, Speech Commun..

[42]  Dirk Van Compernolle,et al.  Optimal feature sub-space selection based on discriminant analysis , 1999, EUROSPEECH.

[43]  R.M. Stern,et al.  Missing-feature approaches in speech recognition , 2005, IEEE Signal Processing Magazine.

[44]  D. Donoho,et al.  Simultaneous cartoon and texture image inpainting using morphological component analysis (MCA) , 2005 .

[45]  E. Candès,et al.  Sparsity and incoherence in compressive sampling , 2006, math/0611957.

[46]  Tuomas Virtanen,et al.  Monaural Sound Source Separation by Nonnegative Matrix Factorization With Temporal Continuity and Sparseness Criteria , 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[47]  Mikkel N. Schmidt,et al.  Linear Regression on Sparse Features for Single-Channel Speech Separation , 2007, 2007 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.

[48]  Brendan J. Frey,et al.  ALGONQUIN: iterating laplace's method to remove multiple types of acoustic distortion for robust speech recognition , 2001, INTERSPEECH.

[49]  DeLiang Wang,et al.  Binary and ratio time-frequency masks for robust speech recognition , 2006, Speech Commun..

[50]  Mikko Kurimo,et al.  Duration modeling techniques for continuous speech recognition , 2004, INTERSPEECH.

[51]  B. Cranen,et al.  Noise reduction through compressed sensing , 2008, INTERSPEECH.

[52]  Michael Picheny,et al.  Speech recognition using noise-adaptive prototypes , 1989, IEEE Trans. Acoust. Speech Signal Process..

[53]  D. Donoho.  For most large underdetermined systems of linear equations the minimal $\ell_1$-norm solution is also the sparsest solution , 2006 .

[54]  Hugo Van hamme.  MIDAS (MIssing DAta Solutions) , 2006 .

[55]  Juan Manuel Górriz,et al.  Speech/non-speech discrimination based on contextual information integrated bispectrum LRT , 2006, IEEE Signal Processing Letters.

[56]  Guy J. Brown,et al.  Techniques for handling convolutional distortion with 'missing data' automatic speech recognition , 2004, Speech Commun..

[57]  R. Tibshirani.  Regression Shrinkage and Selection via the Lasso , 1996 .

[58]  Guy J. Brown,et al.  Mask estimation for missing data speech recognition based on statistics of binaural interaction , 2006, IEEE Transactions on Audio, Speech, and Language Processing.

[59]  Richard M. Stern,et al.  Reconstruction of missing features for robust speech recognition , 2004, Speech Commun..

[60]  E.J. Candes,et al.  An Introduction To Compressive Sampling , 2008, IEEE Signal Processing Magazine.

[61]  Phil D. Green,et al.  Missing data techniques for robust speech recognition , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[62]  Jj Odell,et al.  The Use of Context in Large Vocabulary Speech Recognition , 1995 .

[63]  René Vidal,et al.  Motion segmentation via robust subspace separation in the presence of outlying, incomplete, or corrupted trajectories , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[64]  S. Hanson,et al.  Some Solutions to the Missing Feature Problem in Vision , 1993 .

[65]  Jon Barker,et al.  Robust ASR based on clean speech models: an evaluation of missing data techniques for connected digit recognition in noise , 2001, INTERSPEECH.

[66]  Alex Acero,et al.  Spoken Language Processing: A Guide to Theory, Algorithm and System Development , 2001 .

[67]  Simon J. Godsill,et al.  Bayesian extensions to non-negative matrix factorisation for audio signal modelling , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.

[68]  Richard M. Stern,et al.  Band-Independent Mask Estimation for Missing-Feature Reconstruction in the Presence of Unknown Background Noise , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.

[69]  Alex Acero,et al.  Training wideband acoustic models using mixed-bandwidth training data via feature bandwidth extension , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005..

[70]  H. Lane,et al.  The Lombard Sign and the Role of Hearing in Speech , 1971 .

[72]  A. Bruckstein,et al.  K-SVD: An Algorithm for Designing of Overcomplete Dictionaries for Sparse Representation , 2005 .

[73]  Hugo Van hamme,et al.  PROSPECT features and their application to missing data techniques for robust speech recognition , 2004, INTERSPEECH.

[74]  Joseph F. Murray,et al.  Dictionary Learning Algorithms for Sparse Representation , 2003, Neural Computation.

[75]  Tuomas Virtanen,et al.  Separation of sound sources by convolutive sparse coding , 2004, SAPA@INTERSPEECH.

[76]  Janne Pylkkönen.  An Efficient One-Pass Decoder for Finnish Large Vocabulary Continuous Speech Recognition , .

[77]  Jean Paul Haton,et al.  On noise masking for automatic missing data speech recognition: A survey and discussion , 2007, Comput. Speech Lang..

[78]  B. Cranen,et al.  Noise robust digit recognition using sparse representations , 2008 .

[79]  Kai-Fu Lee,et al.  Automatic Speech Recognition , 1989 .

[80]  Mark J. F. Gales,et al.  Issues with uncertainty decoding for noise robust automatic speech recognition , 2008, Speech Commun..

[81]  S. Nakamura,et al.  Sequential Noise Compensation by Sequential Monte Carlo Method , 2001, NIPS.

[82]  Odette Scharenborg,et al.  The interspeech 2008 consonant challenge , 2008, INTERSPEECH.

[83]  Martin Cooke,et al.  A glimpsing model of speech perception in noise. , 2006, The Journal of the Acoustical Society of America.

[84]  Yin Zhang.  When is missing data recoverable? , 2006 .

[85]  Jort F. Gemmeke.  Classification on incomplete data: imputation is optional , 2008 .

[86]  Richard M. Stern,et al.  A Bayesian classifier for spectrographic mask estimation for missing feature speech recognition , 2004, Speech Commun..

[87]  Bert Cranen,et al.  On the relation between statistical properties of spectrographic masks and recognition accuracy , 2008 .

[88]  Patrick Wambacq,et al.  Improved parameter tying for efficient acoustic model evaluation in large vocabulary continuous speech recognition , 1998, ICSLP.

[89]  David Pearce,et al.  The aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions , 2000, INTERSPEECH.

[90]  Sang Ryong Kim,et al.  Application of sequential estimation to time-varying environment compensation [in speech recognition] , 1997, 1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings.

[91]  H. Sebastian Seung,et al.  Algorithms for Non-negative Matrix Factorization , 2000, NIPS.

[92]  Mathias Creutz,et al.  Unsupervised Discovery of Morphemes , 2002, SIGMORPHON.

[93]  Daniel P. W. Ellis,et al.  Estimating single-channel source separation masks: relevance vector machine classifiers vs. pitch-based masking , 2006, SAPA@INTERSPEECH.

[94]  Rémi Gribonval,et al.  Audio source separation with a single sensor , 2006, IEEE Transactions on Audio, Speech, and Language Processing.

[95]  Marc Moonen,et al.  Double-Talk-Robust Prediction Error Identification Algorithms for Acoustic Echo Cancellation , 2007, IEEE Transactions on Signal Processing.

[96]  Hynek Hermansky,et al.  Towards increasing speech recognition error rates , 1995, Speech Commun..

[97]  Hugo Van hamme.  Handling Time-Derivative Features in a Missing Data Framework for Robust Automatic Speech Recognition , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.

[98]  Chih-Jen Lin,et al.  On the Convergence of Multiplicative Update Algorithms for Nonnegative Matrix Factorization , 2007, IEEE Transactions on Neural Networks.

[99]  Mikko Kurimo,et al.  Missing feature reconstruction and acoustic model adaptation combined for large vocabulary continuous speech recognition , 2008, 2008 16th European Signal Processing Conference.

[100]  Daniel D. Lee,et al.  Multiplicative Updates for Nonnegative Quadratic Programming in Support Vector Machines , 2002, NIPS.

[101]  Richard M. Stern,et al.  A vector Taylor series approach for environment-independent speech recognition , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.

[102]  D. Donoho,et al.  Counting faces of randomly-projected polytopes when the projection radically lowers dimension , 2006, math/0607364.

[103]  Tuomas Virtanen.  Monaural Sound Source Separation by Nonnegative Matrix Factorization With Temporal Continuity and Sparseness Criteria , 2007 .

[104]  Mark J. F. Gales,et al.  Semi-tied covariance matrices for hidden Markov models , 1999, IEEE Trans. Speech Audio Process..

[105]  Jean-Claude Junqua,et al.  Robustness in Automatic Speech Recognition: Fundamentals and Applications , 1995 .