A Novel Approach in Determining Neural Networks Architecture to Classify Data With Large Number of Attributes

One of the challenges in successfully implementing deep neural networks (DNN) lies in determining their architecture, that is, the number of hidden layers and the number of neurons in each hidden layer. In this research, a new approach is proposed to determine the architecture of a neural network, specifically a multi-layer perceptron (MLP), which is then used as a machine learning method to classify data with a large number of attributes. The new approach is proposed because earlier approaches no longer serve as general guidelines for determining neural network architectures. In the proposed approach, the number of hidden layers is determined using principal component analysis (PCA), while the number of neurons in each hidden layer is determined using K-Means clustering. The resulting architectures are used to classify data with a large number of attributes: the Gas Sensor Array Drift dataset, which has 128 input attributes and six output classes, and the Parkinson's Disease Classification dataset, which has 754 input attributes and two output classes. The results indicate that the best-performing architecture for the first dataset uses one hidden layer, with a PCA cumulative variance of 69.7%, while the best architecture for the second dataset uses three hidden layers, with a PCA cumulative variance of 38.9%. Increasing the number of hidden layers does not always improve the performance of a neural network, so it is essential to choose an appropriate number of hidden layers and neurons. The use of PCA and K-Means clustering is expected to provide guidelines for determining neural network architectures with good performance.
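To make the idea concrete, below is a minimal sketch, assuming scikit-learn, of one way PCA and K-Means outputs could be mapped to an MLP architecture in the spirit of the abstract. The cumulative-variance threshold, the rule converting retained components into a layer count, the inertia-based "elbow" rule for the neuron count, and the load_digits stand-in dataset are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumptions, not the authors' code): derive an MLP
# architecture from PCA and K-Means, then train it with scikit-learn.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler


def suggest_hidden_layers(X, variance_threshold=0.7, max_layers=3):
    """Map the PCA cumulative explained variance to a hidden-layer count."""
    pca = PCA().fit(X)
    cumulative = np.cumsum(pca.explained_variance_ratio_)
    # Number of components needed to reach the threshold (assumed heuristic:
    # more components -> deeper network), capped at max_layers.
    n_components = int(np.searchsorted(cumulative, variance_threshold)) + 1
    return max(1, min(max_layers, n_components // 10 + 1))


def suggest_neurons(X, k_range=range(2, 16)):
    """Pick a neuron count from the K-Means inertia 'elbow' (assumed heuristic)."""
    inertias = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
                for k in k_range]
    # The largest single drop in inertia marks the elbow; use the k just
    # after that drop as the suggested number of neurons per layer.
    drops = np.diff(inertias)
    return list(k_range)[int(np.argmin(drops)) + 1]


# Toy usage on a public dataset (stand-in for the gas-sensor / Parkinson data).
X, y = load_digits(return_X_y=True)
X = StandardScaler().fit_transform(X)

n_layers = suggest_hidden_layers(X)
n_neurons = suggest_neurons(X)
mlp = MLPClassifier(hidden_layer_sizes=(n_neurons,) * n_layers,
                    max_iter=500, random_state=0).fit(X, y)
print(f"layers={n_layers}, neurons/layer={n_neurons}, "
      f"train accuracy={mlp.score(X, y):.3f}")
```

In a study such as the one described, the threshold and mapping rules would be tuned and validated on the actual Gas Sensor Array Drift and Parkinson's Disease Classification datasets rather than fixed as above.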
