Speech Recognition with Combined MFCC, MODGDF and ZCPA Features Extraction Techniques Using NTN and MNTN Conventional Classifiers for Telugu Language

Automatic speech recognition (ASR) systems are designed to make human-computer interaction effortless, and they have a wide range of applications. Text-to-speech and speech-to-text conversion are among the most widely used ASR components. Accurate speech recognition depends on choosing feature extraction and classification techniques suited to the slang and pitch of the language. Telugu is a South Indian language with around 120 million speakers. Various feature extraction techniques exist, such as LPC, MFCC, MODGDF, RASTA, DTW and ZCPA. This paper deals with combined techniques and compares them with the individual techniques. Features extracted with the joint techniques give promising results compared with those from any individual technique. MFCC, MODGDF and ZCPA are combined, and joint features are extracted. The next stage classifies the extracted features using neural networks. Features are classified by neural tree network (NTN) and modified neural tree network (MNTN) classifiers for speaker-dependent recognition, and results are presented for closed and open sets. The MNTN is evaluated in several speaker recognition experiments, including closed- and open-set speaker identification and speaker verification. The MNTN is found to perform better than the NTN classifier. Speech recognition rates for Telugu obtained by combining MFCC, MODGDF and ZCPA feature extraction with NTN and MNTN classification are compared, with excellent results.
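The abstract does not say how the three feature streams are joined or how closed- and open-set decisions are made. The sketch below illustrates one common reading, assuming frame-by-frame concatenation of the MFCC, MODGDF and ZCPA vectors and a simple score threshold for the open-set case. The function names, the score dictionary and the threshold are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Optional, Sequence

Frame = List[float]


def combine_features(
    mfcc: Sequence[Frame], modgdf: Sequence[Frame], zcpa: Sequence[Frame]
) -> List[Frame]:
    """Concatenate three per-frame feature streams into one joint vector per frame.

    Assumption: all three streams were computed on the same frame grid.
    """
    if not (len(mfcc) == len(modgdf) == len(zcpa)):
        raise ValueError("all feature streams must have the same number of frames")
    return [list(a) + list(b) + list(c) for a, b, c in zip(mfcc, modgdf, zcpa)]


def identify_closed_set(scores: Dict[str, float]) -> str:
    """Closed set: the speaker is assumed to be enrolled, so take the best score."""
    return max(scores, key=lambda name: scores[name])


def identify_open_set(scores: Dict[str, float], threshold: float) -> Optional[str]:
    """Open set: return None (reject) when even the best score is below threshold."""
    best = identify_closed_set(scores)
    return best if scores[best] >= threshold else None


# Toy values only; real features would come from an audio front end.
joint = combine_features([[1.0, 2.0]], [[3.0]], [[4.0, 5.0]])
scores = {"spk_a": 0.40, "spk_b": 0.55}  # hypothetical classifier scores
closed_result = identify_closed_set(scores)
open_result = identify_open_set(scores, threshold=0.70)
```

In this sketch the closed-set decision always names an enrolled speaker, while the open-set decision can reject the input as an unknown speaker. That difference is why the two conditions are evaluated separately.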
