Comparative Analysis of Different Feature Extraction and Classifier Techniques for Speaker Identification Systems: A Review

Speech recognition is a natural means of interaction between a human and a smart assistive environment. For this interaction to be effective, such a system should attain a high recognition rate even under adverse conditions. In speech recognition, speech signals are automatically converted into the corresponding sequence of words in text. When the training and testing conditions differ, statistical speech recognition algorithms suffer severe degradation in recognition accuracy. Everyday communication therefore depends on intelligible and recognizable sounds. In this paper, we first give a brief overview of speech recognition and then describe several feature extraction and classifier techniques. We compare three feature extraction techniques: Mel-frequency cepstral coefficients (MFCC), linear predictive coding (LPC), and perceptual linear prediction (PLP). Our tests indicate that MFCC is more efficient and accurate than LPC and PLP for voice recognition, and thus more suitable for practical applications.
