Knowledge Matters: Importance of Prior Information for Optimization