Dense adaptive cascade forest: a self-adaptive deep ensemble for classification problems

Recent research has shown that deep forest ensembles achieve a considerable increase in classification accuracy over general ensemble learning methods, especially when the training set is small. In this paper, we build on the deep forest ensemble and introduce the dense adaptive cascade forest (daForest). Our model outperforms the original cascade forest thanks to three major features. First, we apply the SAMME.R boosting algorithm, which guarantees that performance improves as the number of layers increases. Second, our model connects each layer to all subsequent layers in a feed-forward fashion, which makes the model more resistant to performance degradation. Third, we add a hyper-parameter optimization layer before the first classification layer, so the model spends less time setting up and finding optimal hyper-parameters. Experimental results show that daForest performs remarkably well, in some cases even outperforming neural networks and achieving state-of-the-art results.
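To make the three features above concrete, the following is a minimal sketch of the dense cascade idea in Python. It assumes scikit-learn's AdaBoostClassifier, whose algorithm="SAMME.R" option exists only in versions before 1.6; the helpers make_layer and fit_cascade, the layer composition, the hyper-parameters, and the early-stopping rule are illustrative assumptions rather than the authors' exact configuration, and the hyper-parameter optimization layer is omitted.

```python
# A minimal sketch of a densely connected cascade of SAMME.R-boosted
# forests, in the spirit of daForest. Layer composition, hyper-parameters,
# and the stopping rule are illustrative assumptions; the paper's
# hyper-parameter optimization layer is omitted here.
import numpy as np
from sklearn.ensemble import (AdaBoostClassifier, ExtraTreesClassifier,
                              RandomForestClassifier)
from sklearn.model_selection import train_test_split


def make_layer(seed=0):
    # One cascade layer: two SAMME.R-boosted forest ensembles.
    # Note: algorithm="SAMME.R" requires scikit-learn < 1.6.
    return [
        AdaBoostClassifier(RandomForestClassifier(n_estimators=50),
                           n_estimators=5, algorithm="SAMME.R",
                           random_state=seed),
        AdaBoostClassifier(ExtraTreesClassifier(n_estimators=50),
                           n_estimators=5, algorithm="SAMME.R",
                           random_state=seed),
    ]


def fit_cascade(X, y, max_layers=5, tol=1e-4):
    X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.2,
                                              random_state=0)
    aug_tr, aug_va = X_tr, X_va  # grows with every layer's output (dense)
    layers, best_acc = [], 0.0
    for depth in range(max_layers):
        layer = make_layer(seed=depth)
        probs_tr, probs_va = [], []
        for clf in layer:
            clf.fit(aug_tr, y_tr)
            probs_tr.append(clf.predict_proba(aug_tr))
            probs_va.append(clf.predict_proba(aug_va))
        layers.append(layer)
        # Dense feed-forward connection: every later layer sees the raw
        # features plus the class vectors of ALL earlier layers.
        aug_tr = np.hstack([aug_tr] + probs_tr)
        aug_va = np.hstack([aug_va] + probs_va)
        classes = layer[0].classes_
        pred = classes[np.mean(probs_va, axis=0).argmax(axis=1)]
        acc = (pred == y_va).mean()
        if acc <= best_acc + tol:  # validation accuracy stalled: stop growing
            layers.pop()
            break
        best_acc = acc
    return layers
```

Growing the cascade layer by layer under a validation check mirrors the self-adaptive depth described above: boosting pushes each layer to be at least as strong as the last, and the dense connections keep earlier representations available so deeper layers resist degradation.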
