Adaptive neuro-fuzzy inference system for evaluating dysarthric automatic speech recognition (ASR) systems: a case study on MVML-based ASR

Owing to improvements in dysarthric automatic speech recognition (ASR) over the last few decades, the demand for assessment and evaluation of such technologies has increased significantly. Evaluation methods for ASR systems are now required to consider multiple qualitative and quantitative metrics. In this study, exploratory factor analysis is conducted to classify the evaluation metrics applied by researchers. Metrics with a high Pearson correlation coefficient ($r > 0.9$) are placed in the same group, reducing the number of metrics from 23 to six main metrics. Artificial neural networks (ANNs) do not require any internal knowledge of system parameters and provide solutions for multivariable problems while delivering fast calculations; hence, they can be used as an alternative to analytical approaches based on the obtained evaluation metrics. Here, the adaptive neuro-fuzzy inference system (ANFIS), which applies an ANN to estimate the membership function parameters of a fuzzy inference system (FIS), is employed for ASR performance evaluation. The proposed algorithm was implemented in MATLAB and used to measure the performance of two dysarthric ASR systems based on the MVML and MVSL active learning theories. The assessment results presented in this paper show the effectiveness of the developed method.
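The grouping step described above, in which metrics with $r > 0.9$ are placed in the same group, can be illustrated with a minimal sketch. This is not the authors' MATLAB implementation or a full exploratory factor analysis: it only thresholds pairwise Pearson correlations and merges correlated metrics with a union-find structure. The metric names, the toy scores, and the `group_metrics` helper are hypothetical and chosen purely for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def group_metrics(scores, threshold=0.9):
    """Merge metrics whose pairwise r exceeds the threshold (hypothetical helper)."""
    names = list(scores)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the group representative, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if pearson(scores[a], scores[b]) > threshold:
                parent[find(b)] = find(a)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return list(groups.values())


# Hypothetical toy scores: one list per metric, one entry per evaluated system.
scores = {
    "accuracy": [0.61, 0.70, 0.78, 0.85, 0.93],
    "word_hit_rate": [0.60, 0.71, 0.77, 0.86, 0.92],
    "response_time": [1.9, 1.2, 1.6, 1.1, 1.5],
    "satisfaction": [3.0, 4.5, 2.5, 4.0, 3.5],
}
groups = group_metrics(scores)
print(groups)
```

In this toy data only `accuracy` and `word_hit_rate` are strongly correlated, so they end up in one group while the other two metrics remain separate. Note that the merge is transitive: two metrics can share a group through a common highly correlated neighbour even if their own correlation is below the threshold.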
