DA-IICT/IIITV System for Low Resource Speech Recognition Challenge 2018
Madhu R. Kamble | Maddala Venkata Siva Krishna | Hemant A. Patil | Ankur T. Patil | Hardik B. Sailor | Diksha Chhabra
[1] F. Jelinek, et al. Perplexity—a measure of the difficulty of speech recognition tasks, 1977.
[2] Yiming Wang, et al. Purely Sequence-Trained Neural Networks for ASR Based on Lattice-Free MMI, 2016, INTERSPEECH.
[3] Petros Maragos, et al. On amplitude and frequency demodulation using energy operators, 1993, IEEE Trans. Signal Process.
[4] David Poeppel, et al. Concurrent encoding of frequency and amplitude modulation in human auditory cortex: MEG evidence, 2006, Journal of neurophysiology.
[5] Alfred Mertins, et al. Analysis and design of gammatone signal models, 2009, The Journal of the Acoustical Society of America.
[6] Lukás Burget, et al. Recurrent neural network based language model, 2010, INTERSPEECH.
[7] Haihua Xu, et al. Minimum Bayes Risk decoding and system combination based on a recursion for edit distance, 2011, Comput. Speech Lang.
[8] Daniel Povey, et al. The Kaldi Speech Recognition Toolkit, 2011.
[9] Marie-Francine Moens, et al. A survey on the application of recurrent neural networks to statistical language modeling, 2015, Comput. Speech Lang.
[10] R. Chitturi, et al. Development of Indian Language Speech Databases for Large Vocabulary Speech Recognition Systems, 2005.
[11] Hemant A. Patil, et al. Novel Unsupervised Auditory Filterbank Learning Using Convolutional RBM for Speech Recognition, 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[12] David Poeppel, et al. Neuronal oscillations and speech perception: critical-band temporal envelopes are the essence, 2013, Front. Hum. Neurosci.
[13] Yoshua Bengio, et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling, 2014, ArXiv.
[14] Sanjeev Khudanpur, et al. A time delay neural network architecture for efficient modeling of long temporal contexts, 2015, INTERSPEECH.
[15] Hardik B Sailor, et al. Auditory feature representation using convolutional restricted Boltzmann machine and Teager energy operator for speech recognition, 2017, The Journal of the Acoustical Society of America.
[16] R. Schlauch, et al. Basilar membrane nonlinearity and loudness, 1998, The Journal of the Acoustical Society of America.
[17] Hemant A. Patil, et al. Development of speech corpora in Gujarati and Marathi for phonetic transcription, 2013, 2013 International Conference Oriental COCOSDA held jointly with 2013 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE).
[18] Brian Kingsbury, et al. Multilingual representations for low resource speech recognition and keyword search, 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[19] Mark J. F. Gales, et al. Recurrent neural network language model training with noise contrastive estimation for speech recognition, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[20] Hermann Ney, et al. Bootstrap estimates for confidence intervals in ASR performance evaluation, 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[21] R V Shannon, et al. Speech Recognition with Primarily Temporal Cues, 1995, Science.
[22] Yiming Wang, et al. Low Latency Acoustic Modeling Using Temporal Convolution and LSTMs, 2018, IEEE Signal Processing Letters.
[23] R. Plomp, et al. Effect of temporal envelope smearing on speech reception, 1994, The Journal of the Acoustical Society of America.
[24] Andreas Stolcke, et al. SRILM - an extensible language modeling toolkit, 2002, INTERSPEECH.
[25] Thomas Quatieri, et al. Discrete-Time Speech Signal Processing: Principles and Practice, 2001.
[26] Avni Rajpal, et al. Unsupervised Filterbank Learning for Speech-based Access System for Agricultural Commodity, 2017, 2017 Ninth International Conference on Advances in Pattern Recognition (ICAPR).
[27] Alan W Black, et al. The Festvox Indic Frontend for Grapheme-to-Phoneme Conversion, 2016.
[28] J. F. Kaiser, et al. On a simple algorithm to calculate the 'energy' of a signal, 1990, International Conference on Acoustics, Speech, and Signal Processing.
[29] Fan-Gang Zeng, et al. Speech recognition with amplitude and frequency modulations, 2005, Proceedings of the National Academy of Sciences of the United States of America.