Random-Forest-Inspired Neural Networks

Neural networks have become very popular in recent years because of the astonishing success of deep learning in domains such as image and speech recognition. In many of these domains, specific neural architectures, such as convolutional networks, fit the particular structure of the problem very well and can therefore perform extremely effectively. However, the success of neural networks is not universal across all domains. For learning problems without any special structure, or when data are limited, neural networks are known to underperform compared with traditional machine-learning methods such as random forests. In this article, we show that a carefully designed neural network with a random-forest structure can have better generalization ability. In fact, this architecture is more powerful than a random forest, because back-propagation provides a more powerful and generalized way of constructing the underlying decision trees. Furthermore, the approach is efficient to train, requiring time proportional to the number of training examples up to a small constant factor. This efficiency makes it practical to train multiple such networks and combine them to improve generalization accuracy. Experimental results on real-world benchmark datasets demonstrate the effectiveness of the proposed enhancements for classification and regression.
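To make the core idea concrete, the sketch below shows one common way to give a neural network a tree-shaped structure and then fine-tune it with back-propagation. This is a minimal illustration in the general spirit of tree-to-network mappings (entropy nets, neural random forests), not the authors' exact architecture: it maps a single decision tree rather than a full forest, and the dataset, sharpness constants `beta1`/`beta2`, depth, and optimizer settings are all illustrative assumptions. The first layer encodes the tree's split tests, the second layer encodes soft leaf-membership indicators, and the output layer holds the leaf class distributions; all weights are then refined by gradient descent.

```python
import numpy as np
import torch
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
dt = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
t = dt.tree_

inner = [i for i in range(t.node_count) if t.children_left[i] != -1]
leaves = [i for i in range(t.node_count) if t.children_left[i] == -1]
idx = {node: k for k, node in enumerate(inner)}
n_feat, n_cls = X.shape[1], t.value.shape[2]

# Parent pointers and branch direction (-1 = left child, +1 = right child).
parent, side = {0: None}, {}
for i in inner:
    for child, s in ((t.children_left[i], -1.0), (t.children_right[i], 1.0)):
        parent[child], side[child] = i, s

# Layer 1: one neuron per split node. tanh(x[f] - threshold) is negative
# on the left branch and positive on the right branch.
W1 = np.zeros((len(inner), n_feat)); b1 = np.zeros(len(inner))
for k, node in enumerate(inner):
    W1[k, t.feature[node]] = 1.0
    b1[k] = -t.threshold[node]

# Layer 2: one neuron per leaf; it fires only when every split on the
# root-to-leaf path goes in the leaf's direction.
W2 = np.zeros((len(leaves), len(inner))); b2 = np.zeros(len(leaves))
W3 = np.zeros((n_cls, len(leaves)))
for j, leaf in enumerate(leaves):
    node, depth = leaf, 0
    while parent[node] is not None:
        W2[j, idx[parent[node]]] = side[node]
        node, depth = parent[node], depth + 1
    b2[j] = 0.5 - depth            # positive only if all path conditions hold
    W3[:, j] = t.value[leaf, 0] / t.value[leaf, 0].sum()  # leaf class mix

params = [torch.tensor(a, dtype=torch.float32, requires_grad=True)
          for a in (W1, b1, W2, b2, W3)]
W1t, b1t, W2t, b2t, W3t = params
beta1, beta2 = 10.0, 10.0          # sharpness: large values mimic hard splits

def forward(x):
    h1 = torch.tanh(beta1 * (x @ W1t.T + b1t))      # soft split decisions
    h2 = torch.sigmoid(beta2 * (h1 @ W2t.T + b2t))  # soft leaf indicators
    return h2 @ W3t.T                               # class scores as logits

Xt = torch.tensor(X, dtype=torch.float32)
yt = torch.tensor(y)
opt = torch.optim.Adam(params, lr=1e-2)
for _ in range(200):               # fine-tune the tree-shaped network
    opt.zero_grad()
    loss = torch.nn.functional.cross_entropy(forward(Xt), yt)
    loss.backward()
    opt.step()

acc = (forward(Xt).argmax(1) == yt).float().mean().item()
print(f"training accuracy after fine-tuning: {acc:.3f}")
```

At initialization the network reproduces the tree's predictions (up to the softness of the tanh/sigmoid splits); back-propagation is then free to adjust split features, thresholds, and leaf distributions jointly, which is what makes the construction strictly more general than the original tree. Extending the sketch toward a forest-shaped network amounts to repeating the same per-tree construction and averaging the per-tree outputs.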
