Human computer interaction using isolated-words speech recognition technology

This research paper aims to develop an isolated-word automatic speech recognition (IWASR) system based on vector quantization (VQ). The system receives an input speech signal, analyzes it, matches it against the trained set of speech signals stored in the database/codebook, and returns the matching results to the user. IWASR is meant to assist customers calling a university's telephone operator by responding to their enquiries conveniently through natural speech. Callers are guided to select the language, the faculty and the name of the staff member they wish to contact. The Mel-frequency cepstral coefficients (MFCC) algorithm was applied to extract features from the speech signals. Vector quantization was then applied to all feature vectors generated by the MFCC stage. A codebook was produced by training the initial VQ codebook. Experimental results showed that the recognition rate improved as the codebook size increased, and that a codebook of 81 feature vectors achieved a recognition rate exceeding 85%.
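The VQ training and matching steps described above can be illustrated with a minimal sketch. It is not the paper's implementation: the training loop uses plain Lloyd (k-means style) iterations from a randomly sampled initial codebook, which is an assumption, and the MFCC front end is replaced by a hypothetical `fake_mfcc` helper that generates synthetic 12-dimensional frames. The function names, the word labels `word_a` and `word_b`, and the codebook size of 8 are all illustrative.

```python
import numpy as np


def train_codebook(features, size, iters=20, seed=0):
    """Train a VQ codebook with Lloyd (k-means style) iterations.

    features: (n_frames, n_dims) array of feature vectors (e.g. MFCCs).
    size: number of codewords; must not exceed n_frames.
    """
    rng = np.random.default_rng(seed)
    # Initial codebook: distinct frames sampled from the training data.
    codebook = features[rng.choice(len(features), size, replace=False)].copy()
    for _ in range(iters):
        # Squared Euclidean distance from every frame to every codeword.
        dists = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        nearest = dists.argmin(axis=1)
        for k in range(size):
            members = features[nearest == k]
            if len(members):  # keep the old codeword if its cell is empty
                codebook[k] = members.mean(axis=0)
    return codebook


def distortion(features, codebook):
    """Average squared distance from each frame to its nearest codeword."""
    dists = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return dists.min(axis=1).mean()


def recognize(features, codebooks):
    """Return the word whose codebook gives the lowest average distortion."""
    return min(codebooks, key=lambda word: distortion(features, codebooks[word]))


def fake_mfcc(center, n_frames, rng):
    # Hypothetical stand-in for MFCC extraction: noisy 12-dim frames.
    return center + 0.5 * rng.standard_normal((n_frames, 12))


rng = np.random.default_rng(1)
# Two illustrative vocabulary words with well-separated synthetic features.
centers = {"word_a": np.full(12, -3.0), "word_b": np.full(12, 3.0)}
codebooks = {
    word: train_codebook(fake_mfcc(center, 200, rng), size=8)
    for word, center in centers.items()
}
test_utterance = fake_mfcc(centers["word_b"], 50, rng)
print(recognize(test_utterance, codebooks))
```

In this sketch one codebook is trained per vocabulary word, and an unknown utterance is assigned to the word whose codebook quantizes its frames with the least average distortion. A real system would replace `fake_mfcc` with an actual MFCC front end.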
