CODA: Constructivism Learning for Instance-Dependent Dropout Architecture Construction

Dropout is attracting intensive research interest in deep learning as an efficient approach to preventing overfitting. Recently, incorporating "structural" information when deciding which units to drop out has produced promising results compared with methods that ignore such information. However, a major issue with existing work is that it fails to differentiate among instances when constructing the dropout architecture. This can be a significant deficiency for many applications. To solve this issue, we propose Constructivism learning for instance-dependent Dropout Architecture (CODA), which is inspired by a philosophical theory, constructivism learning. Specifically, based on this theory, we have designed a better dropout technique, Uniform Process Mixture Models, using a Bayesian nonparametric method, the uniform process. We have evaluated our proposed method on five real-world datasets and compared its performance with that of other state-of-the-art dropout techniques. The experimental results demonstrate the effectiveness of CODA.
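To make the idea of an instance-dependent dropout architecture more concrete, the following is a minimal sketch, not the CODA model itself. It assumes the standard sequential form of the uniform process prior: with K existing clusters and a concentration parameter theta, a new instance joins each existing cluster with probability 1/(K + theta) and opens a new cluster with probability theta/(K + theta). Pairing each cluster with one fixed random dropout mask is an illustrative assumption, and all function names and parameters (`uniform_process_assignments`, `cluster_dropout_masks`, `apply_mask`, `theta`, `keep_prob`) are hypothetical.

```python
import random


def uniform_process_assignments(n_instances, theta, rng):
    """Sequentially draw cluster labels from a uniform process prior.

    With K existing clusters, the next instance joins each existing cluster
    with probability 1 / (K + theta) and opens a new cluster with
    probability theta / (K + theta), independent of cluster sizes.
    """
    assignments = []
    n_clusters = 0
    for _ in range(n_instances):
        u = rng.random() * (n_clusters + theta)
        if u < n_clusters:
            k = int(u)  # each existing cluster owns a unit-length slice
        else:
            k = n_clusters  # remaining mass theta opens a new cluster
            n_clusters += 1
        assignments.append(k)
    return assignments


def cluster_dropout_masks(n_clusters, n_units, keep_prob, rng):
    """Assumption for illustration: one fixed binary mask per cluster."""
    return [
        [1 if rng.random() < keep_prob else 0 for _ in range(n_units)]
        for _ in range(n_clusters)
    ]


def apply_mask(activations, mask, keep_prob):
    """Inverted dropout: zero the dropped units and rescale the kept ones."""
    return [a * m / keep_prob for a, m in zip(activations, mask)]


rng = random.Random(0)
keep_prob = 0.5
assignments = uniform_process_assignments(10, theta=1.0, rng=rng)
masks = cluster_dropout_masks(max(assignments) + 1, n_units=6,
                              keep_prob=keep_prob, rng=rng)

# Every instance has the same toy activations; only its cluster's mask differs.
hidden = [1.0] * 6
dropped = [apply_mask(hidden, masks[k], keep_prob) for k in assignments]
```

In this sketch, instances assigned to the same cluster share a dropout architecture, while instances in different clusters may drop different units. The actual method would infer the assignments and architectures from data rather than sampling them from the prior.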
