Combining tandem and hybrid systems for improved speech recognition and keyword spotting on low resource languages
暂无分享,去创建一个
[1] Igor Sz. SUB-WORD MODELING OF OUT OF VOCABULARY WORDS IN SPOKEN TERM DETECTION , 2008 .
[2] Lukás Burget,et al. Sub-word modeling of out of vocabulary words in spoken term detection , 2008, 2008 IEEE Spoken Language Technology Workshop.
[3] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition , 2012 .
[4] Hervé Bourlard,et al. Connectionist Speech Recognition: A Hybrid Approach , 1993 .
[5] Jean-Luc Gauvain,et al. Lightly supervised and unsupervised acoustic model training , 2002, Comput. Speech Lang..
[6] Andreas Stolcke,et al. The SRI/OGI 2006 spoken term detection system , 2007, INTERSPEECH.
[7] Brian Kingsbury,et al. Exploiting diversity for spoken term detection , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[8] Mark J. F. Gales,et al. Maximum likelihood linear transformations for HMM-based speech recognition , 1998, Comput. Speech Lang..
[9] George Zavaliagkos,et al. Utilizing untranscribed training data to improve perfomance , 1998, LREC.
[10] Jonathan G. Fiscus,et al. Results of the 2006 Spoken Term Detection Evaluation , 2006 .
[11] Daniel P. W. Ellis,et al. Tandem connectionist feature extraction for conventional HMM systems , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).
[12] Jonathan G. Fiscus,et al. REDUCED WORD ERROR RATES , 1997 .
[13] Xiaodong Cui,et al. Data Augmentation for Deep Neural Network Acoustic Modeling , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[14] Nizar Habash,et al. Unsupervised Morphology-Based Vocabulary Expansion , 2014, ACL.
[15] Geoffrey Zweig,et al. fMPE: discriminatively trained features for speech recognition , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005..
[16] Xiaodong Cui,et al. A high-performance Cantonese keyword search system , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[17] Keiichi Tokuda,et al. Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis , 1999, EUROSPEECH.
[18] Xiaodong Cui,et al. System combination and score normalization for spoken term detection , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[19] Dong Yu,et al. Feature engineering in Context-Dependent Deep Neural Networks for conversational speech transcription , 2011, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding.
[20] Tanja Schultz,et al. FLEXIBLE DECISCION TREES FOR GRAPHEME BASED SPEECH RECOGNITION , 2004 .
[21] Mark J. F. Gales,et al. Progress in the CU-HTK broadcast news transcription system , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[22] Richard M. Schwartz,et al. Score normalization and system combination for improved keyword spotting , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[23] S. J. Young,et al. Tree-based state tying for high accuracy acoustic modelling , 1994 .
[24] Herbert Gish,et al. Rapid and accurate spoken term detection , 2007, INTERSPEECH.
[25] Mark J. F. Gales,et al. The efficient incorporation of MLP features into automatic speech recognition systems , 2011, Comput. Speech Lang..
[26] Mark J. F. Gales,et al. Semi-tied covariance matrices for hidden Markov models , 1999, IEEE Trans. Speech Audio Process..
[27] Daniel Povey,et al. Minimum Phone Error and I-smoothing for improved discriminative training , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[28] Navdeep Jaitly,et al. Vocal Tract Length Perturbation (VTLP) improves speech recognition , 2013 .
[29] Gunnar Evermann,et al. Large vocabulary decoding and confidence estimation using word posterior probabilities , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).
[30] Martin Karafiát,et al. Study of probabilistic and Bottle-Neck features in multilingual environment , 2011, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding.