Paper information - On the challenge of learning complex functions. - 字舞流文

On the challenge of learning complex functions.

Yoshua Bengio

[1] Sunita Sarawagi. Learning with Graphical Models, 2008.

[2] Jason Weston, et al. Large-scale kernel machines, 2007.

[3] Radford M. Neal. Pattern Recognition and Machine Learning, 2007, Technometrics.

[4] Yoshua Bengio, et al. Scaling learning algorithms towards AI, 2007.

[5] Yoshua Bengio, et al. Greedy Layer-Wise Training of Deep Networks, 2006, NIPS.

[6] Marc'Aurelio Ranzato, et al. Efficient Learning of Sparse Representations with an Energy-Based Model, 2006, NIPS.

[7] Geoffrey E. Hinton, et al. Reducing the Dimensionality of Data with Neural Networks, 2006, Science.

[8] Yee Whye Teh, et al. A Fast Learning Algorithm for Deep Belief Nets, 2006, Neural Computation.

[9] Nicolas Le Roux, et al. The Curse of Highly Variable Functions for Local Kernel Machines, 2005, NIPS.

[10] Nicolas Le Roux, et al. Convex Neural Networks, 2005, NIPS.

[11] R. Guillery. Is postnatal neocortical maturation hierarchical?, 2005, Trends in Neurosciences.

[12] D. Ruppert. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2004.

[13] Corinna Cortes, et al. Support-Vector Networks, 1995, Machine Learning.

[14] Ronald, et al. Learning representations by backpropagating errors, 2004.

[15] Gerald Tesauro, et al. Practical issues in temporal difference learning, 1992, Machine Learning.

[16] Michael Schmitt, et al. Descartes' Rule of Signs for Radial Basis Function Neural Networks, 2002, Neural Computation.

[17] Paul E. Utgoff, et al. Many-Layered Learning, 2002, Neural Computation.

[18] B. Schölkopf, et al. Advances in kernel methods: support vector learning, 1999.

[19] Vladimir Vapnik, et al. Statistical learning theory, 1998.

[20] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.

[21] Eric Allender, et al. Circuit Complexity before the Dawn of the New Millennium, 1996, FSTTCS.

[22] David H. Wolpert, et al. The Lack of A Priori Distinctions Between Learning Algorithms, 1996, Neural Computation.

[23] San Cristóbal Mateo, et al. The Lack of A Priori Distinctions Between Learning Algorithms, 1996.

[24] Yoav Freund, et al. A decision-theoretic generalization of on-line learning and an application to boosting, 1997, EuroCOLT.

[25] Yoshua Bengio, et al. Hierarchical Recurrent Neural Networks for Long-Term Dependencies, 1995, NIPS.

[26] Kenji Fukumizu, et al. Active Learning in Multilayer Perceptrons, 1995, NIPS.

[27] Gadi Pinkas, et al. Improving Connectionist Energy Minimization, 1995, J. Artif. Intell. Res.

[28] Yoshua Bengio, et al. Diffusion of Context and Credit Information in Markovian Models, 1995, J. Artif. Intell. Res.

[29] David A. Cohn, et al. Active Learning with Statistical Models, 1996, NIPS.

[30] Yoshua Bengio, et al. Learning long-term dependencies with gradient descent is difficult, 1994, IEEE Trans. Neural Networks.

[31] Bernhard E. Boser, et al. A training algorithm for optimal margin classifiers, 1992, COLT '92.

[32] Lawrence D. Jackel, et al. Backpropagation Applied to Handwritten Zip Code Recognition, 1989, Neural Computation.

[33] J. Håstad. Computational limitations of small-depth circuits, 1987.

[34] Geoffrey E. Hinton, et al. Learning representations by back-propagating errors, 1986, Nature.

[35] Geoffrey E. Hinton, et al. Learning representations by back-propagation errors, nature, 1986.

[36] Miklós Ajtai, et al. Σ¹₁-Formulae on finite structures, 1983, Ann. Pure Appl. Log.

[37] P. L. Adams. THE ORIGINS OF INTELLIGENCE IN CHILDREN, 1976.

[38] David G. Stork, et al. Pattern Classification, 1973.