Determining Optimal Multi-layer Perceptron Structure Using Linear Regression

This paper presents a novel method for determining the optimal Multi-layer Perceptron structure using Linear Regression. By first clustering the dataset used to train a neural network, it is possible to define Multiple Linear Regression models that determine the architecture of the network. Unlike other methods, this method works unsupervised and is more flexible across different types of datasets. The proposed method adapts to the complexity of the training dataset to provide the best results regardless of the dataset's size and type. The clustering algorithm used imposes a specific analysis of the training data, such as determining the distance measure, the normalization, and the clustering technique suited to the type of training dataset used. A minimal sketch of this pipeline appears below.
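The abstract does not spell out which quantities feed the regression models, so the following is only a minimal sketch in Python under stated assumptions: it normalizes and clusters a training set with agglomerative hierarchical clustering (scikit-learn's AgglomerativeClustering), summarizes the result with a handful of assumed statistics, and fits a multiple linear regression (LinearRegression) that maps those statistics to a hidden-layer size. The feature set, the calibration pairs, and the helper cluster_statistics are hypothetical illustrations, not the paper's exact formulation.

```python
# Sketch: cluster the training data, then regress assumed cluster/dataset
# statistics onto a hidden-layer size for an MLP. Features and calibration
# targets are placeholders, not the paper's published procedure.
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler


def cluster_statistics(X, n_clusters=5):
    """Normalize the data, cluster it, and return simple summary features."""
    X_scaled = StandardScaler().fit_transform(X)
    labels = AgglomerativeClustering(n_clusters=n_clusters).fit_predict(X_scaled)
    sizes = np.bincount(labels)
    return np.array([
        X.shape[0],    # number of samples
        X.shape[1],    # number of input features
        n_clusters,    # number of clusters used
        sizes.mean(),  # mean cluster size
        sizes.std(),   # spread of cluster sizes
    ])


# Assumed calibration set: per-dataset statistics paired with hidden-layer
# sizes that worked well in earlier experiments (values are placeholders).
stats = np.array([cluster_statistics(np.random.rand(n, d))
                  for n, d in [(200, 4), (500, 8), (1000, 16)]])
known_hidden_sizes = np.array([6, 12, 24])

reg = LinearRegression().fit(stats, known_hidden_sizes)

# Suggest a hidden-layer size for a new training dataset.
X_new = np.random.rand(750, 10)
predicted_hidden = int(round(reg.predict([cluster_statistics(X_new)])[0]))
print("suggested hidden units:", predicted_hidden)
```

In practice the distance measure, normalization, and clustering technique would be chosen to match the dataset type, as the abstract notes; the agglomerative clustering and Euclidean scaling above are just one plausible configuration.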
