Regularizing Neural Networks via Retaining Confident Connections
Dawei Song | Benyou Wang | Yuexian Hou | Shengnan Zhang
[1] Dawei Song, et al. Extending the Extreme Physical Information to Universal Cognitive Models via a Confident Information First Principle, 2014, Entropy.
[2] W. Wong, et al. The calculation of posterior distributions by data augmentation, 1987.
[3] H. Bozdogan. Model selection and Akaike's Information Criterion (AIC): The general theory and its analytical extensions, 1987.
[4] Shun-ichi Amari, et al. Information geometry on hierarchy of probability distributions, 2001, IEEE Trans. Inf. Theory.
[5] Donald F. Specht, et al. Probabilistic neural networks, 1990, Neural Networks.
[6] R. Kass. The Geometry of Asymptotic Inference, 1989.
[7] Tara N. Sainath, et al. Deep convolutional neural networks for LVCSR, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[8] Shun-ichi Amari, et al. Methods of information geometry, 2000.
[9] Shun-ichi Amari, et al. Information geometry of Boltzmann machines, 1992, IEEE Trans. Neural Networks.
[10] Henry P. Wynn, et al. Algebraic and geometric methods in statistics, 2009.
[11] Yann LeCun, et al. Regularization of Neural Networks using DropConnect, 2013, ICML.
[12] Geoffrey E. Hinton, et al. Reducing the Dimensionality of Data with Neural Networks, 2006, Science.
[13] C. R. Rao, et al. Information and the Accuracy Attainable in the Estimation of Statistical Parameters, 1992.
[14] Dawei Song, et al. Mining pure high-order word associations via information geometry for information retrieval, 2013, TOIS.
[15] Jonathan Tompson, et al. Joint Training of a Convolutional Network and a Graphical Model for Human Pose Estimation, 2014, NIPS.
[16] Nitish Srivastava, et al. Dropout: a simple way to prevent neural networks from overfitting, 2014, J. Mach. Learn. Res.
[17] Alex Krizhevsky, et al. Learning Multiple Layers of Features from Tiny Images, 2009.
[18] Lukás Burget, et al. Strategies for training large scale neural network language models, 2011, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding.
[19] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[20] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.
[21] Tara N. Sainath, et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups, 2012, IEEE Signal Processing Magazine.
[22] H. Akaike. A new look at the statistical model identification, 1974.
[23] Nitish Srivastava, et al. Improving neural networks by preventing co-adaptation of feature detectors, 2012, ArXiv.
[24] Tara N. Sainath, et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition, 2012.
[25] Shun-ichi Amari, et al. Information-Geometric Measure for Neural Spikes, 2002, Neural Computation.
[26] Jason Weston, et al. A unified architecture for natural language processing: deep neural networks with multitask learning, 2008, ICML '08.
[27] N. Čencov. Statistical Decision Rules and Optimal Inference, 2000.
[28] Camille Couprie, et al. Learning Hierarchical Features for Scene Labeling, 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[29] Wenjie Li, et al. A Confident Information First Principle for Parameter Reduction and Model Selection of Boltzmann Machines, 2015, IEEE Transactions on Neural Networks and Learning Systems.
[30] Yee Whye Teh, et al. A Fast Learning Algorithm for Deep Belief Nets, 2006, Neural Computation.