Paper information - RECONSTRUCTED STATE SPACE MODEL FOR RECOGNITION OF CONSONANT - VOWEL (CV) UTTERANCES USING SUPPORT VECTOR MACHINES

This paper studies the use of Support Vector Machines (SVMs) for classifying Malayalam Consonant-Vowel (CV) speech units, comparing them with two other classification algorithms: Artificial Neural Network (ANN) and k-Nearest Neighbour (k-NN). We extend the SVM to the multiclass case by combining many two-class classifiers using the Decision Directed Acyclic Graph (DDAG) algorithm. A feature extraction technique based on Reconstructed State Space (RSS) State Space Point Distribution (SSPD) parameters is studied. Using SSPD features with the SVM classifier, we obtain an average recognition accuracy of 90% on a Malayalam CV speech unit database in speaker-independent environments. The results show that the proposed technique can increase speaker-independent consonant speech recognition accuracy and can be used effectively in developing a complete speech recognition system for the Malayalam language.
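The two ideas named in the abstract, a point distribution over a reconstructed state space and DDAG combination of two-class classifiers, can be illustrated with a minimal sketch. It is not the authors' implementation. The embedding dimension (2), the lag (1), the grid size, the class labels `ka`, `ga`, `cha`, and the nearest-centroid deciders that stand in for trained two-class SVMs are all assumptions made for illustration; the abstract does not give these details.

```python
from itertools import combinations

def delay_embed(signal, dim=2, lag=1):
    """Time-delay embedding: points (x[n], x[n+lag], ..., x[n+(dim-1)*lag])."""
    n_points = len(signal) - (dim - 1) * lag
    return [tuple(signal[i + k * lag] for k in range(dim)) for i in range(n_points)]

def point_distribution(points, bins=4, lo=-1.0, hi=1.0):
    """Count 2-D state-space points per cell of a bins x bins grid (assumed layout)."""
    counts = [0] * (bins * bins)
    width = (hi - lo) / bins
    for x, y in points:
        # Clamp so that values on or outside the boundary fall in an edge cell.
        col = max(0, min(int((x - lo) / width), bins - 1))
        row = max(0, min(int((y - lo) / width), bins - 1))
        counts[row * bins + col] += 1
    return counts

def ddag_predict(x, classes, pairwise):
    """DDAG evaluation: test the first remaining class against the last, drop the loser."""
    remaining = list(classes)
    while len(remaining) > 1:
        a, b = remaining[0], remaining[-1]
        if pairwise[(a, b)](x) == a:
            remaining.pop()       # b is eliminated
        else:
            remaining.pop(0)      # a is eliminated
    return remaining[0]

# Hypothetical 1-D class centroids; a real system would train one SVM per class pair.
centroids = {"ka": 0.0, "ga": 1.0, "cha": 2.0}
classes = list(centroids)

def make_decider(a, b):
    """Stand-in for a trained two-class classifier: pick the nearer centroid."""
    return lambda x: a if abs(x - centroids[a]) <= abs(x - centroids[b]) else b

pairwise = {(a, b): make_decider(a, b) for a, b in combinations(classes, 2)}

points = delay_embed([0.0, 0.5, -0.5, 1.0])
features = point_distribution(points, bins=2)
label = ddag_predict(1.9, classes, pairwise)
```

For N classes this scheme trains N(N-1)/2 two-class classifiers but evaluates only N-1 of them per test sample, since each evaluation eliminates one candidate class.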

T. M. Thasleema | N. K. Narayanan | P. Prajith
