Database development and automatic speech recognition of isolated Pashto spoken digits using MFCC and K-NN

Automatic recognition of isolated spoken digits is one of the most challenging tasks in the area of automatic speech recognition. This paper presents the development of a database of isolated Pashto spoken digits, from Sefer (0) to Naha (9), and their automatic recognition. Fifty native Pashto speakers (25 male and 25 female), aged 18 to 60 years, each uttered the digits from Sefer (0) to Naha (9) separately. A Sony PCM-M10 linear recorder was used for recording in office and home settings in a noise-free environment. Adobe Audition version 1.0 was used to split the recorded audio into individual digits, and the results were saved in .wav format. Mel-frequency cepstral coefficients (MFCC) are used to extract speech features. A K-nearest neighbor (K-NN) classifier is used to classify the speech features, for the first time in the Pashto language to the best of the authors' knowledge, and its accuracy is compared with that of linear discriminant analysis. The experimental results are evaluated, and an overall average recognition accuracy of 76.8% is obtained.
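
The classification step can be illustrated with a minimal K-NN sketch. It assumes each utterance has already been reduced to a fixed-length MFCC feature vector; the toy vectors, the Euclidean distance, the value k = 3, and the helper names `knn_predict` and `accuracy` are illustrative assumptions, not the paper's data or settings.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train, query, k=3):
    """Return the majority label among the k training vectors nearest to query.

    train is a list of (feature_vector, label) pairs. On a tied vote, the
    label that appears first among the nearest neighbours wins.
    """
    neighbours = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=3):
    """Fraction of (vector, label) test pairs that are classified correctly."""
    correct = sum(1 for vec, label in test if knn_predict(train, vec, k) == label)
    return correct / len(test)


# Hypothetical 3-dimensional "MFCC" vectors for two digits, 0 (Sefer) and 9 (Naha).
train = [
    ([1.0, 0.9, 0.1], 0),
    ([1.1, 1.0, 0.0], 0),
    ([0.9, 1.1, 0.2], 0),
    ([4.0, 3.9, 2.1], 9),
    ([4.2, 4.1, 1.9], 9),
    ([3.9, 4.0, 2.0], 9),
]
test = [([1.05, 0.95, 0.1], 0), ([4.1, 4.0, 2.0], 9)]

print(knn_predict(train, [1.05, 0.95, 0.1]))
print(accuracy(train, test))
```

In practice the feature vectors would come from an MFCC front end applied to each .wav file, and k would be chosen by validation on held-out speakers.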
