The Self-Organizing Restricted Boltzmann Machine for Deep Representation with the Application on Classification Problems

Abstract Deep learning has recently proliferated in the field of representation learning. A deep belief network (DBN) is a deep architecture that can generate multiple levels of features from input patterns, using restricted Boltzmann machines (RBMs) as its building blocks. A deep model can achieve very high accuracy in many applications, but that accuracy depends on the model structure. Specifying the parameters of a deep architecture, such as the number of hidden layers and neurons, is a difficult task even for expert designers. Moreover, these values are typically set manually, which is costly in time and computation, especially on big data. In this paper, we introduce an approach that determines the number of hidden layers and neurons of the deep network automatically during learning. To this end, the input vector is transformed from a low-dimensional feature space into a higher-dimensional feature space in the hidden layer of an RBM. The new features are then ranked according to their power to discriminate between classes in the new space, using the separability-correlation measure for feature importance ranking. The algorithm uses the mean of the weights as a threshold: hidden neurons whose weights exceed the threshold are retained, and the others are removed. The number of retained neurons is taken as a suitable number of hidden neurons, and the number of layers in the deep model is determined using validation data. The proposed approach also acts as a regularizer: because neurons whose weights fall below the threshold are removed, the RBM learns to copy the input only approximately, and a suitable number of hidden layers and neurons further prevents over-fitting. As a result, the DBN determines its structure from the input data and becomes a self-organizing model.
Experimental results on benchmark datasets confirm the effectiveness of the proposed method.
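The mean-of-weights pruning step described above can be sketched as follows. This is an illustrative reading of the abstract, not the paper's implementation: the per-neuron importance score (here, the mean absolute incoming weight) and the function name `prune_hidden_neurons` are assumptions for the sake of the example.

```python
import numpy as np

def prune_hidden_neurons(W):
    """Keep hidden neurons whose importance exceeds the mean importance.

    W: trained RBM weight matrix, shape (n_visible, n_hidden).
    Returns the pruned weight matrix and the indices of retained neurons.
    The score (mean absolute incoming weight) is one plausible choice;
    the paper's exact ranking uses the separability-correlation measure.
    """
    importance = np.abs(W).mean(axis=0)   # one score per hidden neuron
    threshold = importance.mean()         # mean of the scores as cut-off
    keep = np.flatnonzero(importance >= threshold)
    return W[:, keep], keep

# Toy example: a random 784-visible, 512-hidden weight matrix.
rng = np.random.default_rng(0)
W = rng.normal(size=(784, 512))
W_pruned, kept = prune_hidden_neurons(W)
print(W_pruned.shape, kept.size)
```

The number of retained columns (`kept.size`) would then serve as the hidden-layer width for the next RBM in the stack.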