Application of independent component analysis to feature extraction of speech

We describe the characteristics that independent component analysis (ICA) can extract from continuous Japanese speech. The speech data, uttered by a female speaker, were selected from the ATR database. The data were recorded at a sampling frequency of 20 kHz and pre-processed with a whitening filter. The network was trained with the information-maximization approach proposed by Bell and Sejnowski (1995). After learning, most of the basis functions, which are the columns of the mixing matrix, were localized in both time and frequency. Furthermore, we confirmed that some basis functions extract acoustic features such as the pitch and the formants of each vowel.
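The pipeline summarized above (whitening, information-maximization learning, reading basis functions off the mixing matrix) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, frame dimension, learning rate, and epoch count are hypothetical, the update uses the natural-gradient form of the infomax rule with a logistic nonlinearity rather than the original gradient, and the input is synthetic Laplacian data standing in for frames of speech.

```python
import numpy as np

def whiten(x):
    """Center the rows of x (dims x samples) and decorrelate them to unit variance."""
    x = x - x.mean(axis=1, keepdims=True)
    cov = x @ x.T / x.shape[1]
    eigval, eigvec = np.linalg.eigh(cov)
    v = eigvec @ np.diag(eigval ** -0.5) @ eigvec.T  # symmetric whitening matrix
    return v @ x, v

def infomax_ica(z, n_epochs=5, batch=100, lr=0.01, seed=0):
    """Estimate an unmixing matrix W for whitened data z (dims x samples).

    Assumption: natural-gradient infomax update with a logistic nonlinearity,
    dW = lr * (I + (1 - 2y) u^T) W, where u = W z and y = logistic(u).
    """
    rng = np.random.default_rng(seed)
    n, t = z.shape
    w = np.eye(n)
    for _ in range(n_epochs):
        order = rng.permutation(t)
        for start in range(0, t - batch + 1, batch):
            u = w @ z[:, order[start:start + batch]]
            g = -np.tanh(u / 2.0)  # equals 1 - 2 * logistic(u), numerically stable
            w += lr * (np.eye(n) + g @ u.T / batch) @ w
    return w

def basis_functions(w, v):
    """Columns of the estimated mixing matrix, expressed in the original data space."""
    return np.linalg.inv(w @ v)

# Synthetic stand-in for speech frames (assumption: 4-sample frames, 2000 frames).
rng = np.random.default_rng(1)
sources = rng.laplace(size=(4, 2000))
mixing = np.eye(4) + 0.5 * np.ones((4, 4))  # hypothetical, well-conditioned mixing matrix
frames = mixing @ sources

z, v = whiten(frames)
w = infomax_ica(z)
basis = basis_functions(w, v)  # each column is one basis function
```

With real speech, each column of `frames` would hold one short window of the waveform, and the time and frequency localization of a basis function could be inspected by plotting a column of `basis` and its Fourier magnitude.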

[1] Juha Karhunen, et al. Nonlinear PCA type approaches for source separation and independent component analysis, 1995, Proceedings of ICNN'95 - International Conference on Neural Networks.

[2] Terrence J. Sejnowski, et al. An Information-Maximization Approach to Blind Separation and Blind Deconvolution, 1995, Neural Computation.

[3] Aapo Hyvärinen, et al. A Fast Fixed-Point Algorithm for Independent Component Analysis, 1997, Neural Computation.

[4] David J. Field, et al. Emergence of simple-cell receptive field properties by learning a sparse code for natural images, 1996, Nature.

[5] Kiyotoshi Matsuoka, et al. A neural net for blind separation of nonstationary signals, 1995, Neural Networks.

[6] Juha Karhunen, et al. Neural approaches to independent component analysis and source separation, 1996, ESANN.

[7] Andrzej Cichocki, et al. A New Learning Algorithm for Blind Signal Separation, 1995, NIPS.

[8] Te-Won Lee, et al. Independent Component Analysis, 1998, Springer US.

[9] Terrence J. Sejnowski, et al. The “independent components” of natural scenes are edge filters, 1997, Vision Research.

[10] Erkki Oja, et al. Signal Separation by Nonlinear Hebbian Learning, 1995.

[11] T. J. Sejnowski, et al. Learning the higher-order structure of a natural sound, 1996, Network.

[12] Christian Jutten, et al. Blind separation of sources, part I: An adaptive algorithm based on neuromimetic architecture, 1991, Signal Process.

[13] Terrence J. Sejnowski, et al. Learning Nonlinear Overcomplete Representations for Efficient Coding, 1997, NIPS.