A Combined Rough Sets–K-means Vector Quantization Model for Arabic Speech Recognizer

Vector quantization (VQ) is considered an efficient data reduction technique and is used as a preprocessing stage in speech recognition systems. Methods traditionally used for vector quantization are purely numerical rather than rule-based. Furthermore, most experiments in previous research used English data sets, and fewer were performed on Arabic data. In this paper, a vector quantization model that incorporates rough sets attribute reduction and rule generation with a modified version of the K-means clustering algorithm was developed, implemented, and tested as part of a speech recognition framework. The learning vector quantization (LVQ) neural network model was used in the pattern matching stage. Arabic speech data was used in the original experiments, for both speaker-dependent and speaker-independent tests. The performance was found to be very high in terms of recognition time and
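As an illustration of the numerical baseline the abstract refers to, the following is a minimal sketch of vector quantization with the standard K-means algorithm. It is not the modified K-means or the rough sets stage described in the paper; the function names (`train_codebook`, `quantize`), the toy two-dimensional feature vectors, and the parameter values are all assumptions chosen for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_index(vector, codebook):
    """Index of the codeword closest to the given vector."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))


def train_codebook(vectors, k, iterations=20, seed=0):
    """Plain K-means: returns k centroids (codewords) for the input vectors."""
    rng = random.Random(seed)
    # Assumption: initialise the codebook with k randomly chosen input vectors.
    codebook = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            clusters[nearest_index(v, codebook)].append(v)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                dim = len(members[0])
                codebook[i] = [sum(m[d] for m in members) / len(members)
                               for d in range(dim)]
    return codebook


def quantize(vectors, codebook):
    """Replace each vector by the index of its nearest codeword."""
    return [nearest_index(v, codebook) for v in vectors]


# Toy stand-in for speech feature vectors (hypothetical data).
features = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
            [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
codebook = train_codebook(features, k=2)
codes = quantize(features, codebook)
```

Each feature vector is reduced to a single codeword index, which is the data reduction role VQ plays ahead of a pattern matching stage such as LVQ.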
