Classification of Spanish vowels and digits using neural networks

In this paper we describe the use of a Multilayer Perceptron Array for learning and classifying speech signals, using characteristic vectors extracted from reconstructed dynamics. We treat the phonatory system as a black box whose only available data is its output, the speech signal. This view provides a way of accessing the underlying dynamics and is the starting point for two kinds of experiments: classification of vowels and classification of digits, both with Venezuelan Spanish voices. The results confirm that characteristic vectors extracted from the underlying dynamics have discriminative power for distinguishing between classes of speech signals. Moreover, neural networks are able to generalize from this kind of data.
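As an illustration of what reconstructing dynamics from a single observed signal can look like, the sketch below builds delay-coordinate vectors from a one-dimensional signal. It is a minimal sketch under stated assumptions: the abstract does not say which reconstruction method is used, so time-delay embedding is assumed here, and the function name `delay_embed`, the parameters `dim` and `tau`, and the synthetic sine frame are all hypothetical choices made for the example, not values from the paper.

```python
import numpy as np

def delay_embed(signal, dim, tau):
    """Return delay-coordinate vectors of a 1-D signal, one vector per row.

    Assumption: time-delay embedding; dim and tau are illustrative parameters.
    """
    x = np.asarray(signal, dtype=float)
    n_vectors = x.size - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("signal too short for the requested dim and tau")
    # Column j holds the signal shifted by j * tau samples.
    return np.column_stack(
        [x[j * tau : j * tau + n_vectors] for j in range(dim)]
    )

# Synthetic stand-in for a short speech frame (not real speech data).
t = np.arange(200)
frame = np.sin(2 * np.pi * t / 25)

points = delay_embed(frame, dim=3, tau=4)
print(points.shape)  # (192, 3)
```

Each row of `points` is one point of the reconstructed trajectory. Characteristic vectors for a classifier could then be computed from such a trajectory, although how the paper derives them is not stated in the abstract.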
