Speaker Independent Speech Recognition of Isolated Words in Room Environment

This paper demonstrates the recognition of selected important words from a large vocabulary using a combination of dynamic and instantaneous features of the speech spectrum. Many procedures exist for recognizing a word by its vowel; this paper presents a highly effective speaker-independent approach that operates under typical room-environment noise. To distinguish isolated words with different vowel sounds, two important features, pitch and formant, are extracted from speech signals collected from a number of randomly chosen male and female speakers. The extracted features are then analysed for each utterance to train the system. The specific objectives of this work are to implement an automatic isolated-word speech recognizer that can both recognize and respond to speech, and an audio interface between human and machine for effective human-machine interaction. The whole system was tested using computer code, and the results were satisfactory in almost 90% of cases. However, the system is sometimes confused by similar vowel sounds.
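The abstract does not say which algorithm is used to extract pitch. As a rough illustration only, the sketch below estimates the pitch of a single voiced frame by autocorrelation, one common choice that is assumed here rather than taken from the paper. The function name `estimate_pitch` and the search range `fmin`/`fmax` are hypothetical.

```python
import numpy as np


def estimate_pitch(signal, sample_rate, fmin=60.0, fmax=400.0):
    """Estimate the fundamental frequency (Hz) of one voiced frame.

    Assumption: autocorrelation peak picking; the paper does not
    specify its pitch extraction method.
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()  # remove DC offset so it does not bias the peak

    # Autocorrelation for non-negative lags only.
    corr = np.correlate(x, x, mode="full")[len(x) - 1:]

    # Restrict the search to lags matching a plausible speech pitch range.
    lag_min = int(sample_rate / fmax)
    lag_max = min(int(sample_rate / fmin), len(corr) - 1)

    # The strongest peak in that range marks one pitch period.
    best_lag = lag_min + int(np.argmax(corr[lag_min:lag_max + 1]))
    return sample_rate / best_lag
```

Called on a short frame (for example 40 ms of samples at 16 kHz), the function returns one pitch value in Hz. A recognizer along the lines described above would presumably combine such values with formant estimates per utterance, but that step is not sketched here.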
