Improved Mispronunciation detection system using a hybrid CTC-ATT based approach for L2 English speakers
[1] Gora Chand Nandi,et al. A Speech Recognition Technique Using MFCC with DWT in Isolated Hindi Words, 2013, ICACNI.
[2] Gora Chand Nandi,et al. Implementation of MFCC based hand gesture recognition on HOAP-2 using Webots platform , 2014, 2014 International Conference on Advances in Computing, Communications and Informatics (ICACCI).
[3] Gora Chand Nandi,et al. An efficient gesture based humanoid learning using wavelet descriptor and MFCC techniques , 2017, Int. J. Mach. Learn. Cybern.
[4] Berlin Chen,et al. An Effective End-to-End Modeling Approach for Mispronunciation Detection , 2020, INTERSPEECH.
[5] Ricardo Gutierrez-Osuna,et al. L2-ARCTIC: A Non-native English Speech Corpus , 2018, INTERSPEECH.
[6] Gora Chand Nandi,et al. A mathematical framework for possibility theory-based hidden Markov model , 2017, Int. J. Bio Inspired Comput.
[7] Berlin Chen,et al. An End-to-End Mispronunciation Detection System for L2 English Speech Leveraging Novel Anti-Phone Modeling , 2020, INTERSPEECH.
[8] John R. Hershey,et al. Hybrid CTC/Attention Architecture for End-to-End Speech Recognition , 2017, IEEE Journal of Selected Topics in Signal Processing.
[9] Gora Chand Nandi,et al. Face liveness detection through face structure analysis , 2014, Int. J. Appl. Pattern Recognit.
[10] Gora Chand Nandi,et al. Development of a self reliant humanoid robot for sketch drawing , 2017, Multimedia Tools and Applications.
[11] Marcin Wlodarczak,et al. TextGridTools: A TextGrid Processing and Analysis Toolkit for Python , 2013.
[12] Alex Graves,et al. Connectionist Temporal Classification , 2012.
[13] Shuang Zhang,et al. Automatic derivation of phonological rules for mispronunciation detection in a computer-assisted pronunciation training system , 2010, INTERSPEECH.
[14] Xunying Liu,et al. CNN-RNN-CTC Based End-to-end Mispronunciation Detection and Diagnosis , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Shinji Watanabe,et al. Joint CTC-attention based end-to-end speech recognition using multi-task learning , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[16] Gora Chand Nandi,et al. A rough set based reasoning approach for criminal identification , 2019, Int. J. Mach. Learn. Cybern.
[17] Expression invariant fragmented face recognition , 2014, 2014 International Conference on Signal Propagation and Computer Technology (ICSPCT 2014).
[18] Ying Zhang,et al. Towards End-to-End Speech Recognition with Deep Convolutional Neural Networks , 2016, INTERSPEECH.
[19] Shweta Tripathi,et al. A speaker invariant speech recognition technique using HFCC features in isolated Hindi words , 2014, Int. J. Comput. Intell. Stud.
[20] Gora Chand Nandi,et al. Human perception based criminal identification through human robot interaction , 2015, 2015 Eighth International Conference on Contemporary Computing (IC3).
[21] Gora Chand Nandi,et al. Development of a Fuzzy Expert System based Liveliness Detection Scheme for Biometric Authentication , 2016, ArXiv.
[22] Gora Chand Nandi,et al. Real-Time Gesture-Based Communication Using Possibility Theory-Based Hidden Markov Model , 2017, Comput. Intell.
[23] Nikita P. Desai,et al. Mel Frequency Cepstral Coefficients (MFCC) based speaker identification in noisy environment using wiener filter , 2014, 2014 International Conference on Green Computing Communication and Electrical Engineering (ICGCCEE).
[24] Yong Wang,et al. Improved mispronunciation detection with deep neural network trained acoustic models and transfer learning based logistic regression classifiers , 2015, Speech Commun.
[25] Kai-Florian Richter,et al. Towards Verbal Explanations by Collaborating Robot Teams , 2019.
[26] Frank K. Soong,et al. Discriminative acoustic model for improving mispronunciation detection and diagnosis in computer-aided pronunciation training (CAPT) , 2010, INTERSPEECH.
[27] Chong Wang,et al. Deep Speech 2: End-to-End Speech Recognition in English and Mandarin , 2015, ICML.
[28] Gora Chand Nandi,et al. Continuous dynamic Indian Sign Language gesture recognition with invariant backgrounds , 2015, 2015 International Conference on Advances in Computing, Communications and Informatics (ICACCI).
[29] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[30] Yoshua Bengio,et al. Attention-Based Models for Speech Recognition , 2015, NIPS.
[31] Thomas Hellström,et al. Fusion of Gesture and Speech for Increased Accuracy in Human Robot Interaction , 2019, 2019 24th International Conference on Methods and Models in Automation and Robotics (MMAR).
[32] Kamalika Datta,et al. Peak Detection based Spread Spectrum Audio Watermarking using Discrete Wavelet Transform , 2011.
[33] Kai-Florian Richter,et al. Verbal explanations by collaborating robot teams , 2020, Paladyn J. Behav. Robotics.
[34] Tara N. Sainath,et al. Deep Learning for Audio Signal Processing , 2019, IEEE Journal of Selected Topics in Signal Processing.
[35] Kai Chen,et al. SED-MDD: Towards Sentence Dependent End-To-End Mispronunciation Detection and Diagnosis , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[36] D. Subashini,et al. Automated Speech Recognition System - A Literature Review , 2017.
[37] Kai-Florian Richter,et al. An Empirical Review of Calibration Techniques for the Pepper Humanoid Robot's RGB and Depth Camera , 2019, IntelliSys.
[38] Gora Chand Nandi,et al. NAO humanoid robot: Analysis of calibration techniques for robot sketch drawing , 2016, Robotics Auton. Syst.
[39] G. C. Nandi,et al. Sketch drawing by NAO humanoid robot , 2015, TENCON 2015 - 2015 IEEE Region 10 Conference.
[40] Gora Chand Nandi,et al. Development of a Framework for Human–Robot interactions with Indian Sign Language Using Possibility Theory , 2017, Int. J. Soc. Robotics.
[41] Gora Chand Nandi,et al. Visual perception-based criminal identification: a query-based approach , 2017, J. Exp. Theor. Artif. Intell.
[42] Kun Li,et al. Mispronunciation Detection and Diagnosis in L2 English Speech Using Multidistribution Deep Neural Networks , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[43] G. C. Nandi,et al. A MFCC based Hindi speech recognition technique using HTK Toolkit , 2013, 2013 IEEE Second International Conference on Image Information Processing (ICIIP-2013).
[44] Yogesh Kumar,et al. A Comprehensive View of Automatic Speech Recognition System - A Systematic Literature Review , 2019, 2019 International Conference on Automation, Computational and Technology Management (ICACTM).
[45] Kai-Florian Richter,et al. Understandable Collaborating Robot Teams , 2020, PAAMS.
[46] G. C. Nandi,et al. Implementation and evaluation of DWT and MFCC based ISL gesture recognition , 2014, 2014 9th International Conference on Industrial and Information Systems (ICIIS).
[47] Kai-Florian Richter,et al. A Fuzzy Inference System for a Visually Grounded Robot State of Mind , 2020, ECAI.
[48] Steve J. Young,et al. Phone-level pronunciation scoring and assessment for interactive language learning , 2000, Speech Commun.
[49] Avinash Kumar Singh,et al. Extracting Primary Objects and Spatial Relations from Sentences , 2019, ICAART.
[50] Wei Li,et al. Improving non-native mispronunciation detection and enriching diagnostic feedback with DNN-based speech attribute modeling , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[51] G. C. Nandi,et al. Possibility theory based continuous Indian Sign Language gesture recognition , 2015, TENCON 2015 - 2015 IEEE Region 10 Conference.
[52] Kai-Florian Richter,et al. Understandable Teams of Pepper Robots , 2020, PAAMS.
[53] Gora Chand Nandi,et al. Face recognition using facial symmetry , 2012, CCSEIT '12.
[54] G. C. Nandi,et al. Face recognition with liveness detection using eye and mouth movement , 2014, 2014 International Conference on Signal Propagation and Computer Technology (ICSPCT 2014).
[55] Long Zhang,et al. End-to-End Automatic Pronunciation Error Detection Based on Improved Hybrid CTC/Attention Architecture , 2020, Sensors.
[56] Kamalika Datta,et al. Comparative study of spread spectrum based audio watermarking techniques , 2011, 2011 International Conference on Recent Trends in Information Technology (ICRTIT).