Parts-based models and local features for automatic speech recognition
[1] Paul A. Viola,et al. Robust Real-Time Face Detection , 2001, International Journal of Computer Vision.
[2] Geoffrey Zweig,et al. Speech Recognition with Dynamic Bayesian Networks , 1998, AAAI/IAAI.
[3] David Pearce,et al. The aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions , 2000, INTERSPEECH.
[4] Simon King,et al. Speech production knowledge in automatic speech recognition. , 2007, The Journal of the Acoustical Society of America.
[5] Darryl Stewart,et al. Subband correlation and robust speech recognition , 2005, IEEE Transactions on Speech and Audio Processing.
[6] Yali Amit,et al. Robust acoustic object detection. , 2005, The Journal of the Acoustical Society of America.
[7] Stan Davis,et al. Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Se , 1980 .
[8] DeLiang Wang,et al. Binary and ratio time-frequency masks for robust speech recognition , 2006, Speech Commun..
[9] Jonathan Le Roux,et al. Discriminative Training for Large-Vocabulary Speech Recognition Using Minimum Classification Error , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[10] Karen Livescu,et al. Feature-based pronunciation modeling for automatic speech recognition , 2005 .
[11] M.G. Bellanger,et al. Digital processing of speech signals , 1980, Proceedings of the IEEE.
[12] Mari Ostendorf,et al. From HMM's to segment models: a unified view of stochastic modeling for speech recognition , 1996, IEEE Trans. Speech Audio Process..
[13] Tomaso A. Poggio,et al. Example-Based Object Detection in Images by Components , 2001, IEEE Trans. Pattern Anal. Mach. Intell..
[14] James R. Glass. A probabilistic framework for segment-based speech recognition , 2003, Comput. Speech Lang..
[15] Daniel P. W. Ellis,et al. Towards single-channel unsupervised source separation of speech mixtures: the layered harmonics/formants separation-tracking model , 2004, SAPA@INTERSPEECH.
[16] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[17] Ljubomir Josifovski,et al. Robust Automatic Speech Recognition with Missing and Unreliable Data , 2003 .
[18] Michael I. Jordan,et al. Boltzmann Chains and Hidden Markov Models , 1994, NIPS.
[19] H Hermansky,et al. Perceptual linear predictive (PLP) analysis of speech. , 1990, The Journal of the Acoustical Society of America.
[20] N. Morgan,et al. Pushing the envelope - aside [speech recognition] , 2005, IEEE Signal Processing Magazine.
[21] Guy J. Brown,et al. Computational Auditory Scene Analysis: Principles, Algorithms, and Applications , 2006 .
[22] Jeff A. Bilmes,et al. Graphical models and automatic speech recognition , 2002 .
[23] V.W. Zue,et al. The use of speech knowledge in automatic speech recognition , 1985, Proceedings of the IEEE.
[24] Tony Ezzat,et al. Discriminative word-spotting using ordered spectro-temporal patch features , 2008, SAPA@INTERSPEECH.
[25] Radford M. Neal. Pattern Recognition and Machine Learning , 2007, Technometrics.
[26] N. One,et al. Explicit Duration Modelling in HMM / ANN Hybrids , 2005 .
[27] Hervé Bourlard,et al. Connectionist Speech Recognition: A Hybrid Approach , 1993 .
[28] Jessika Eichel,et al. Fundamentals of Hearing: An Introduction , 1978, The Ulster Medical Journal.
[29] Ning Ma,et al. Exploiting correlogram structure for robust speech recognition with multiple speech sources , 2007, Speech Commun..
[30] Philip C. Woodland,et al. Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models , 1995, Comput. Speech Lang..
[31] A. Liberman,et al. Some Cues for the Distinction Between Voiced and Voiceless Stops in Initial Position , 1957 .
[32] Andrew K. Halberstadt. Heterogeneous acoustic measurements and multiple classifiers for speech recognition , 1999 .
[33] Li Lee,et al. Speaker normalization using efficient frequency warping procedures , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[34] Michael I. Jordan,et al. Graphical models: Probabilistic inference , 2002 .
[35] D. Wang,et al. Computational Auditory Scene Analysis: Principles, Algorithms, and Applications , 2008, IEEE Trans. Neural Networks.
[36] Lori F. Lamel. Formalizing knowledge used in spectrogram reading: acoustic and perceptual evidence from stops , 1988 .
[37] Daniel P. W. Ellis,et al. Decoding speech in the presence of other sources , 2005, Speech Commun..
[38] James R. Glass,et al. Speech recognition with localized time-frequency pattern detectors , 2007, 2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU).
[39] Odette Scharenborg,et al. Comparing human and machine recognition performance on a VCV corpus , 2008 .
[40] Hervé Bourlard,et al. A new ASR approach based on independent processing and recombination of partial frequency bands , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[41] Rainer Lienhart,et al. An extended set of Haar-like features for rapid object detection , 2002, Proceedings. International Conference on Image Processing.
[42] A. Liberman,et al. The role of selected stimulus-variables in the perception of the unvoiced stop consonants. , 1952, The American journal of psychology.
[43] Kuansan Wang,et al. Spectral shape analysis in the central auditory system , 1995, IEEE Trans. Speech Audio Process..
[44] Odette Scharenborg,et al. Reaching over the gap: A review of efforts to link human and automatic speech recognition research , 2007, Speech Commun..
[45] Stanley F. Chen,et al. An Empirical Study of Smoothing Techniques for Language Modeling , 1996, ACL.
[46] Lori Lamel,et al. Formalizing knowledge used in spectrogram reading: acoustic and perceptual evidence from stops , 1988 .
[47] Brendan J. Frey,et al. Factor graphs and the sum-product algorithm , 2001, IEEE Trans. Inf. Theory.
[48] Coarticulation • Suprasegmentals,et al. Acoustic Phonetics , 2019, The SAGE Encyclopedia of Human Communication Sciences and Disorders.
[49] Paul A. Viola,et al. Robust Real-time Object Detection , 2001 .
[50] Mari Ostendorf,et al. Moving beyond the 'beads-on-a-string' model of speech , 1999 .
[51] Guy J. Brown,et al. Computational auditory scene analysis , 1994, Comput. Speech Lang..
[52] Fu Jie Huang,et al. A Tutorial on Energy-Based Learning , 2006 .
[53] William T. Freeman,et al. Understanding belief propagation and its generalizations , 2003 .
[54] Hynek Hermansky,et al. Temporal patterns (TRAPs) in ASR of noisy speech , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).
[55] Hynek Hermansky,et al. Towards increasing speech recognition error rates , 1995, Speech Commun..
[56] Daniel P. Huttenlocher,et al. Pictorial Structures for Object Recognition , 2004, International Journal of Computer Vision.
[57] Roy D. Patterson,et al. A Dynamic Compressive Gammachirp Auditory Filterbank , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[58] David Gelbart,et al. Improving word accuracy with Gabor feature extraction , 2002, INTERSPEECH.
[59] Alex Acero,et al. Spoken Language Processing: A Guide to Theory, Algorithm and System Development , 2001 .
[60] Lawrence R. Rabiner,et al. A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.
[61] Michael Kleinschmidt,et al. Localized spectro-temporal features for automatic speech recognition , 2003, INTERSPEECH.
[62] Ron Cole,et al. The ISOLET spoken letter database , 1990 .
[63] Ronald L. Rivest,et al. Introduction to Algorithms, Second Edition , 2001 .
[64] DeLiang Wang,et al. On Ideal Binary Mask As the Computational Goal of Auditory Scene Analysis , 2005, Speech Separation by Humans and Machines.
[65] Richard Lippmann,et al. Speech recognition by machines and humans , 1997, Speech Commun..
[66] D. Rubin,et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .
[67] Odette Scharenborg,et al. The interspeech 2008 consonant challenge , 2008, INTERSPEECH.
[68] Martin Cooke,et al. A glimpsing model of speech perception in noise. , 2006, The Journal of the Acoustical Society of America.
[69] C. D. Forgie,et al. Automatic Recognition of Spoken Digits , 1958 .
[70] S A Shamma,et al. Spectro-temporal response field characterization with dynamic ripples in ferret primary auditory cortex. , 2001, Journal of neurophysiology.
[71] Victor W. Zue,et al. Visual characterization of speech spectrograms , 1986, ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing.
[72] Michael I. Jordan,et al. Factorial Hidden Markov Models , 1995, Machine Learning.
[73] Daniel Patrick Whittlesey Ellis,et al. Prediction-driven computational auditory scene analysis , 1996 .
[74] Ronald L. Rivest,et al. Introduction to Algorithms , 1990 .
[75] Powen Ru,et al. Multiresolution spectrotemporal analysis of complex sounds. , 2005, The Journal of the Acoustical Society of America.
[76] Martin A. Fischler,et al. The Representation and Matching of Pictorial Structures , 1973, IEEE Transactions on Computers.
[77] Antonio Torralba,et al. Describing Visual Scenes Using Transformed Objects and Parts , 2008, International Journal of Computer Vision.
[78] Ronald A. Cole,et al. Performing fine phonetic distinctions: templates versus features , 1990 .
[79] Simon King,et al. Articulatory Feature-Based Methods for Acoustic and Audio-Visual Speech Recognition: Summary from the 2006 JHU Summer workshop , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[80] Alvin M. Liberman,et al. Speech: A Special Code , 1996 .
[81] A. Liberman,et al. Some Experiments on the Perception of Synthetic Speech Sounds , 1952 .
[82] Richard M. Stern,et al. Reconstruction of incomplete spectrograms for robust speech recognition , 2000 .
[83] Corinna Cortes,et al. Support-Vector Networks , 1995, Machine Learning.
[84] Robert E. Schapire,et al. The Boosting Approach to Machine Learning An Overview , 2003 .
[85] Biing-Hwang Juang,et al. Fundamentals of speech recognition , 1993, Prentice Hall signal processing series.
[86] Roger K. Moore,et al. Towards capturing fine phonetic variation in speech using articulatory features , 2007, Speech Commun..
[87] Steve Young,et al. The HTK book , 1995 .
[88] Alex Acero,et al. Hidden conditional random fields for phone classification , 2005, INTERSPEECH.
[89] Tony Ezzat,et al. Localized spectro-temporal cepstral analysis of speech , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.
[90] Hermann Ney,et al. Improved methods for vocal tract normalization , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).
[91] A. Liberman,et al. Minimal Rules for Synthesizing Speech , 1959 .
[92] Simon King,et al. Articulatory feature recognition using dynamic Bayesian networks , 2007, Comput. Speech Lang..
[93] Richard S. Zemel,et al. Learning Parts-Based Representations of Data , 2006, J. Mach. Learn. Res..
[94] Katrin Kirchhoff,et al. Robust speech recognition using articulatory information , 1998 .
[95] Thomas Serre,et al. Component-based face detection , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.
[96] Jont B. Allen,et al. How do humans process and recognize speech? , 1993, IEEE Trans. Speech Audio Process..
[97] Thomas Serre,et al. Robust Object Recognition with Cortex-Like Mechanisms , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[98] R. Rifkin,et al. Notes on Regularized Least Squares , 2007 .