Text Independent Speaker Recognition and Speaker Independent Speech Recognition Using Iterative Clustering Approach

This paper demonstrates the effectiveness of perceptual features and an iterative clustering approach for both speech and speaker recognition. The procedure used to form the training speech differs between the two tasks: one is used to develop training models for speaker independent speech recognition and another for text independent speaker recognition. The work therefore emphasizes the use of clustering models developed from the training data. With the mel frequency perceptual linear predictive cepstrum feature, these models achieve accuracies of 91%, 91% and 99.5% for speaker identification, isolated digit recognition and continuous speech recognition, respectively. The same feature also yields an equal error rate as low as 9%, which is the performance measure used for speaker verification. The work is experimentally evaluated on isolated digits and continuous speech from the TI digits_1 and TI digits_2 databases for speech recognition, and on the speech of 50 speakers randomly chosen from the TIMIT database for speaker recognition. The noteworthy features of the speaker recognition algorithm are that the testing procedure is evaluated on identical messages from all 50 speakers, that the results are validated theoretically using the F-ratio, and that they are validated statistically using the χ² distribution.
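The equal error rate mentioned above is the operating point at which the false acceptance rate and the false rejection rate of a verification system coincide. The following is a minimal sketch of how such a value could be estimated from match scores, not the procedure used in the paper. It assumes that higher scores mean a better match; the function name `equal_error_rate` and the example scores are hypothetical.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Estimate the equal error rate (EER) from two lists of match scores.

    Assumption: a trial is accepted when its score is >= the threshold,
    so higher scores indicate a better match to the claimed speaker.
    """
    # Candidate thresholds: every distinct score that was observed.
    thresholds = sorted(set(genuine_scores) | set(impostor_scores))
    best_gap = None
    best_eer = None
    for t in thresholds:
        # False acceptance rate: impostor trials wrongly accepted.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        # False rejection rate: genuine trials wrongly rejected.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap = gap
            best_eer = (far + frr) / 2
    return best_eer


# Hypothetical scores, for illustration only.
genuine = [0.9, 0.8, 0.3, 0.7]
impostor = [0.1, 0.2, 0.75, 0.4]
eer = equal_error_rate(genuine, impostor)
print(f"EER = {eer:.2%}")
```

With a finite set of trials the two error rates rarely cross exactly, so this sketch reports their average at the threshold where they are closest; interpolating between neighboring thresholds is a common alternative.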
