Performance Analysis of Deep Autoencoder and NCA Dimensionality Reduction Techniques with KNN, ENN and SVM Classifiers

The central aim of this paper is to implement the Deep Autoencoder and Neighborhood Components Analysis (NCA) dimensionality reduction methods in MATLAB and to observe how these algorithms behave on nine different datasets from the UCI Machine Learning Repository: CNAE-9, Movement Libras, Pima Indians Diabetes, Parkinsons, Knowledge, Segmentation, Seeds, Mammographic Masses, and Ionosphere. First, the dimensionality of each dataset is reduced to fifty percent of its original size by selecting and extracting the most relevant features using the Deep Autoencoder and NCA techniques. Each reduced dataset is then classified with the K-Nearest Neighbors (KNN), Extended Nearest Neighbor (ENN), and Support Vector Machine (SVM) algorithms, all implemented in the MATLAB environment. In every classification run, the training-to-test data ratio is fixed at 90:10. After classification, the variation in accuracy is analyzed to determine how compatible each dimensionality reduction technique is with each classifier and to evaluate the performance of each classifier on each dataset.
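A minimal sketch of this pipeline in MATLAB is given below, assuming the Deep Learning Toolbox (trainAutoencoder, encode) and the Statistics and Machine Learning Toolbox (fscnca, fitcknn, fitcecoc, cvpartition). The Ionosphere dataset, the single-hidden-layer autoencoder, the SGD solver for NCA, and the choice of five neighbors for KNN are illustrative assumptions, not the authors' exact configuration; ENN has no built-in MATLAB implementation and is omitted here.

% Sketch of the workflow: reduce each dataset to 50% of its original
% dimension with an autoencoder (feature extraction) and with NCA
% (feature selection), then classify with a 90:10 train/test split.
load ionosphere            % example dataset: X (351x34 features), Y (labels)
k = floor(size(X, 2) / 2); % target dimension: fifty percent of the original

% --- Deep Autoencoder: extract a k-dimensional code ---
autoenc = trainAutoencoder(X', k, 'MaxEpochs', 200); % columns are samples
Xae = encode(autoenc, X')';                          % k extracted features

% --- NCA: select the k highest-weight original features ---
ncaMdl = fscnca(X, Y, 'Solver', 'sgd');
[~, idx] = sort(ncaMdl.FeatureWeights, 'descend');
Xnca = X(:, idx(1:k));                               % k selected features

% --- 90:10 holdout split, then KNN and SVM on each reduced dataset ---
cv = cvpartition(Y, 'HoldOut', 0.10);
for Xr = {Xae, Xnca}
    Xtr = Xr{1}(training(cv), :);  Ytr = Y(training(cv));
    Xte = Xr{1}(test(cv), :);      Yte = Y(test(cv));

    knnMdl = fitcknn(Xtr, Ytr, 'NumNeighbors', 5);
    svmMdl = fitcecoc(Xtr, Ytr);  % SVM learners via error-correcting codes

    accKNN = mean(strcmp(predict(knnMdl, Xte), Yte));
    accSVM = mean(strcmp(predict(svmMdl, Xte), Yte));
    fprintf('KNN accuracy: %.3f   SVM accuracy: %.3f\n', accKNN, accSVM);
end

Note the distinction the sketch preserves: the autoencoder produces new features (extraction) while NCA ranks and keeps a subset of the original ones (selection), so the two reduced datasets feed the same classifiers but are not directly comparable feature-for-feature.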
