Speaker similarity score based fast phoneme classification by using neighborhood components analysis

The k-nearest neighbor (k-NN) classifier can learn non-linear decision surfaces and requires only one hyperparameter (the value of k) for training. Its classification accuracy improves as the amount of training data increases; however, the computational and memory requirements grow as well, since the classifier must store and search through the entire training set to classify a single test point. In this paper we reduce the computational and memory cost of our speaker similarity score algorithm for phoneme classification by applying dimensionality reduction based on neighborhood components analysis (NCA). The speaker similarity score algorithm uses k-NN to learn a similarity score for a target speaker and then uses this score in a weighted k-NN phoneme classifier. With NCA we obtain a significant reduction in computational and memory requirements at the expense of a small loss in phoneme classification performance. Experiments on the TIMIT dataset show a 56% reduction in computational cost when the feature space is reduced from 50 to 22 dimensions.
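To make the described pipeline concrete, below is a minimal sketch in Python, assuming scikit-learn's NeighborhoodComponentsAnalysis and KNeighborsClassifier. The 50-to-22 reduction follows the numbers in the abstract; the synthetic data and the generic distance weighting are illustrative assumptions that stand in for the actual TIMIT features and the paper's learned speaker similarity score weights.

import numpy as np
from sklearn.neighbors import NeighborhoodComponentsAnalysis, KNeighborsClassifier
from sklearn.pipeline import Pipeline

# Synthetic stand-ins for frame-level acoustic features (50-dimensional)
# and phoneme labels; real experiments would use TIMIT frames instead.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(1000, 50))
y_train = rng.integers(0, 39, size=1000)  # e.g., 39 folded phoneme classes

pipeline = Pipeline([
    # NCA learns a linear projection that pulls same-class neighbors
    # together, here reducing the features from 50 to 22 dimensions.
    ("nca", NeighborhoodComponentsAnalysis(n_components=22, random_state=0)),
    # Distance-weighted k-NN; the paper instead weights neighbors by a
    # learned speaker similarity score, which is not reproduced here.
    ("knn", KNeighborsClassifier(n_neighbors=5, weights="distance")),
])
pipeline.fit(X_train, y_train)

# Classify new frames in the reduced 22-dimensional space.
X_test = rng.normal(size=(10, 50))
print(pipeline.predict(X_test))

Because the cost of a brute-force k-NN search scales with feature dimensionality, performing the neighbor search after the NCA projection is what yields the kind of computational saving the abstract reports.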
