A Study of Speech Feature Extraction Based on Manifold Learning

Manifold learning is a nonlinear dimensionality reduction method that seeks the underlying structure of observed data and reveals its inherent regularities. Traditional MFCC features slow down learning because they are high-dimensional and contain irrelevant noise. We therefore propose a speech feature extraction method based on manifold learning. First, we apply a manifold learning dimensionality reduction algorithm to Mel features and use the reduced features for vowel classification. To further demonstrate the effectiveness of manifold learning features in speech recognition, we then propose a fused speech feature extraction method and apply it to the recognition of Chinese isolated words. Experiments show that the fused feature extraction method achieves better results than the traditional MFCC feature extraction method.
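The first step described above (reducing Mel features with a manifold learning algorithm, then classifying in the reduced space) can be illustrated with a minimal sketch. The abstract does not name the algorithm used, so Isomap is swapped in here purely as an assumption. The 13-dimensional "features" are synthetic points on a noisy curve, not real MFCCs, and the three class labels, the neighbourhood size, and the helper names `isomap` and `loo_1nn_accuracy` are hypothetical choices made for illustration.

```python
import numpy as np

def isomap(X, n_neighbors=8, n_components=2):
    """Minimal Isomap: kNN graph -> geodesic distances -> classical MDS."""
    n = X.shape[0]
    # Pairwise Euclidean distances between all feature vectors.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Keep only edges to each point's nearest neighbours, then symmetrise.
    G = np.full((n, n), np.inf)
    np.fill_diagonal(G, 0.0)
    idx = np.argsort(D, axis=1)[:, 1:n_neighbors + 1]
    rows = np.repeat(np.arange(n), n_neighbors)
    G[rows, idx.ravel()] = D[rows, idx.ravel()]
    G = np.minimum(G, G.T)
    # Floyd-Warshall shortest paths approximate geodesic distances.
    for k in range(n):
        G = np.minimum(G, G[:, k:k + 1] + G[k:k + 1, :])
    if np.isinf(G).any():
        raise ValueError("neighbourhood graph is disconnected; increase n_neighbors")
    # Classical MDS: double-centre the squared distances and eigendecompose.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (G ** 2) @ J
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:n_components]
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0.0))

def loo_1nn_accuracy(Z, y):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier."""
    d = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # a point may not vote for itself
    return float((y[d.argmin(axis=1)] == y).mean())

# Synthetic stand-in for Mel features (assumption: NOT real speech data).
# A 2-D arc is linearly lifted into 13 dimensions and perturbed with noise.
rng = np.random.default_rng(0)
n_frames = 150
t = np.linspace(0.0, 1.0, n_frames)
theta = 1.5 * np.pi * t
latent = np.column_stack([np.cos(theta), np.sin(theta)])
features = latent @ rng.normal(size=(2, 13)) + 0.01 * rng.normal(size=(n_frames, 13))
# Three hypothetical "vowel" classes, one per third of the curve.
labels = np.minimum((t * 3).astype(int), 2)

embedding = isomap(features, n_neighbors=8, n_components=2)
accuracy = loo_1nn_accuracy(embedding, labels)
print(f"embedding shape: {embedding.shape}, leave-one-out 1-NN accuracy: {accuracy:.2f}")
```

With real speech, the `features` array would be replaced by per-frame Mel features, and the classifier and neighbourhood size would need tuning; the sketch only shows how the reduction and classification stages fit together.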
