Deep Architectures for Baby AI

Motivation: understanding intelligence, building AI, and scaling to large-scale learning of complex functions

Yoshua Bengio

[1] Nicolas Le Roux, et al. Learning Eigenfunctions Links Spectral Embedding and Kernel PCA, 2004, Neural Computation.

[2] Radford M. Neal. Pattern Recognition and Machine Learning, 2007, Technometrics.

[3] Geoffrey E. Hinton. Training Products of Experts by Minimizing Contrastive Divergence, 2002, Neural Computation.

[4] R. Zemel. A minimum description length framework for unsupervised learning, 1994.

[5] Jason Weston, et al. Large-scale kernel machines, 2007.

[6] Lawrence D. Jackel, et al. Backpropagation Applied to Handwritten Zip Code Recognition, 1989, Neural Computation.

[7] Leo Breiman, et al. Classification and Regression Trees, 1984.

[8] Geoffrey E. Hinton, et al. Learning representations by back-propagating errors, 1986, Nature.

[9] Mikhail Belkin, et al. Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering, 2001, NIPS.

[10] Yair Weiss, et al. Segmentation using eigenvectors: a unifying view, 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[11] Geoffrey E. Hinton, et al. Extracting distributed representations of concepts and relations from positive and negative propositions, 2000, Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. IJCNN 2000. Neural Computing: New Challenges and Perspectives for the New Millennium.

[12] Geoffrey E. Hinton, et al. Self Supervised Boosting, 2002, NIPS.

[13] Yee Whye Teh, et al. A Fast Learning Algorithm for Deep Belief Nets, 2006, Neural Computation.

[14] Geoffrey E. Hinton, et al. The "wake-sleep" algorithm for unsupervised neural networks, 1995, Science.

[15] Geoffrey E. Hinton, et al. Reducing the Dimensionality of Data with Neural Networks, 2006, Science.

[16] Geoffrey E. Hinton, et al. Exponential Family Harmoniums with an Application to Information Retrieval, 2004, NIPS.

[17] John Moody, et al. Fast Learning in Networks of Locally-Tuned Processing Units, 1989, Neural Computation.

[18] G. Tesauro. Practical Issues in Temporal Difference Learning, 1992.

[19] Yoav Freund, et al. Experiments with a New Boosting Algorithm, 1996, ICML.

[20] Yee Whye Teh, et al. Unsupervised Discovery of Nonlinear Structure Using Contrastive Backpropagation, 2006, Cogn. Sci.

[21] Yoshua Bengio, et al. Non-Local Manifold Tangent Learning, 2004, NIPS.

[22] Yoshua Bengio, et al. Scaling learning algorithms towards AI, 2007.

[23] Thomas G. Dietterich, et al. Advances in Neural Information Processing Systems 13, Papers from Neural Information Processing Systems (NIPS) 2000, Denver, CO, USA, 2001, NIPS.

[24] Yoshua Bengio, et al. Convolutional networks for images, speech, and time series, 1998.

[25] Geoffrey E. Hinton, et al. Learning distributed representations of concepts, 1989.

[26] J. Håstad. Computational limitations of small-depth circuits, 1987.

[27] Bernhard Schölkopf, et al. Nonlinear Component Analysis as a Kernel Eigenvalue Problem, 1998, Neural Computation.

[28] Nicolas Le Roux, et al. The Curse of Highly Variable Functions for Local Kernel Machines, 2005, NIPS.

[29] Thomas G. Dietterich, et al. Editors. Advances in Neural Information Processing Systems, 2002.

[30] Lawrence K. Saul, et al. Think Globally, Fit Locally: Unsupervised Learning of Low Dimensional Manifold, 2003, J. Mach. Learn. Res.

[31] Geoffrey E. Hinton, et al. Generative models for discovering sparse distributed representations, 1997, Philosophical transactions of the Royal Society of London. Series B, Biological sciences.

[32] James L. McClelland, et al. Parallel distributed processing: explorations in the microstructure of cognition, vol. 1: foundations, 1986.

[33] Pascal Vincent, et al. Non-Local Manifold Parzen Windows, 2005, NIPS.

[34] Brian Hazlehurst, et al. How to invent a lexicon: the development of shared symbols in interaction, 2006.

[35] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.

[36] P. L. Adams. The Origins of Intelligence in Children, 1976.

[37] Eric Allender, et al. Circuit Complexity before the Dawn of the New Millennium, 1996, FSTTCS.

[38] Yoshua Bengio, et al. An empirical evaluation of deep architectures on problems with many factors of variation, 2007, ICML '07.

[39] J. Tenenbaum, et al. A global geometric framework for nonlinear dimensionality reduction, 2000, Science.

[40] Yoshua Bengio, et al. A Neural Probabilistic Language Model, 2003, J. Mach. Learn. Res.

[41] Mahesan Niranjan, et al. Neural networks and radial basis functions in classifying static speech patterns, 1990.

[42] Thomas Hofmann, et al. Greedy Layer-Wise Training of Deep Networks, 2007.

[43] J. van Leeuwen, et al. Neural Networks: Tricks of the Trade, 2002, Lecture Notes in Computer Science.

[44] Geoffrey E. Hinton, et al. Autoencoders, Minimum Description Length and Helmholtz Free Energy, 1993, NIPS.

[45] R. Guillery. Is postnatal neocortical maturation hierarchical?, 2005, Trends in Neurosciences.

[46] David J. Spiegelhalter, et al. Advances in Neural Information Processing Systems 15 (NIPS 2002), 2002.

[47] Garrison W. Cottrell, et al. Non-Linear Dimensionality Reduction, 1992, NIPS.

[48] Paul E. Utgoff, et al. Many-Layered Learning, 2002, Neural Computation.