Supervised data transformation and dimensionality reduction with a 3-layer multi-layer perceptron for classification problems

The aim of data transformation is to map the original feature space of a dataset into another space with better properties. This is typically combined with dimensionality reduction, so that the transformed space has fewer dimensions than the original. A widely used method for data transformation and dimensionality reduction is Principal Component Analysis (PCA), which finds a subspace that explains most of the data variance. While the PCA feature space has interesting properties, such as removing linear correlation, PCA is an unsupervised method, so there is no guarantee that its feature space will be the most appropriate for supervised tasks such as classification or regression. On the other hand, three-layer Multi-Layer Perceptrons (MLPs), which are supervised models, can also be understood as performing a data transformation in the hidden layer, followed by a classification/regression operation in the output layer. Given that the hidden layer is obtained through a supervised training process, it can be considered to carry out a supervised data transformation; and if the number of hidden neurons is smaller than the number of inputs, it also performs dimensionality reduction. Although this kind of transformation is widely available (any neural network package that exposes the hidden-layer weights can be used), no extensive experimentation on the quality of the three-layer MLP data transformation has been carried out. The aim of this article is to carry out this research for classification problems. Results show that, overall, this supervised transformation offers better results than the unsupervised PCA transformation.
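As a minimal sketch of the idea described above (not the paper's actual experimental setup), the following assumes scikit-learn's `MLPClassifier` and the iris dataset: a three-layer MLP is trained on the labels, and the trained hidden layer is then reused as a supervised projection, producing a reduced feature space comparable in dimensionality to a PCA baseline.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neural_network import MLPClassifier
from sklearn.decomposition import PCA

X, y = load_iris(return_X_y=True)

# Train a three-layer MLP (input -> hidden -> output). The hidden layer
# has 2 units, fewer than the 4 input features, so reusing it as a
# transformation also reduces dimensionality.
mlp = MLPClassifier(hidden_layer_sizes=(2,), activation="tanh",
                    max_iter=2000, random_state=0)
mlp.fit(X, y)

# Supervised transformation: project the data through the trained
# hidden layer (weights, biases, and the hidden activation function).
W, b = mlp.coefs_[0], mlp.intercepts_[0]
X_mlp = np.tanh(X @ W + b)  # shape: (n_samples, n_hidden)

# Unsupervised baseline: PCA to the same target dimensionality.
X_pca = PCA(n_components=2).fit_transform(X)

print(X_mlp.shape, X_pca.shape)
```

Both transformed datasets could then be fed to any downstream classifier (e.g. k-nearest neighbours) to compare the quality of the supervised and unsupervised feature spaces.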
