[1] J. Laurie Snell,et al. I. The Ising model , 1980 .
[2] C. D. Gelatt,et al. Optimization by Simulated Annealing , 1983, Science.
[3] Geoffrey E. Hinton,et al. Learning representations by back-propagating errors , 1986, Nature.
[5] Sepp Hochreiter,et al. Untersuchungen zu dynamischen neuronalen Netzen [Investigations on dynamic neural networks] , 1991 .
[6] Geoffrey E. Hinton,et al. Self-organizing neural network that discovers surfaces in random-dot stereograms , 1992, Nature.
[7] Yoshua Bengio,et al. Learning long-term dependencies with gradient descent is difficult , 1994, IEEE Trans. Neural Networks.
[8] Geoffrey E. Hinton,et al. The "wake-sleep" algorithm for unsupervised neural networks. , 1995, Science.
[9] Geoffrey E. Hinton,et al. Bayesian Learning for Neural Networks , 1995 .
[10] Michael I. Jordan,et al. Exploiting Tractable Substructures in Intractable Networks , 1995, NIPS.
[11] Teuvo Kohonen,et al. Emergence of invariant-feature detectors in the adaptive-subspace self-organizing map , 1996, Biological Cybernetics.
[12] David J. Field,et al. Emergence of simple-cell receptive field properties by learning a sparse code for natural images , 1996, Nature.
[13] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[14] H. Sebastian Seung,et al. Learning Continuous Attractors in Recurrent Networks , 1997, NIPS.
[15] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[16] Brendan J. Frey,et al. Graphical Models for Machine Learning and Digital Communication , 1998 .
[17] Geoffrey E. Hinton. Products of experts , 1999 .
[18] Rosenbaum,et al. Quantum annealing of a disordered magnet , 1999, Science.
[19] Samy Bengio,et al. Taking on the curse of dimensionality in joint distributions using neural networks , 2000, IEEE Trans. Neural Networks Learn. Syst..
[20] David Maxwell Chickering,et al. Dependency Networks for Inference, Collaborative Filtering, and Data Visualization , 2000, J. Mach. Learn. Res..
[21] Yoshua Bengio,et al. A Neural Probabilistic Language Model , 2003, J. Mach. Learn. Res..
[22] Joshua B. Tenenbaum,et al. Separating Style and Content with Bilinear Models , 2000, Neural Computation.
[24] Aapo Hyvärinen,et al. Emergence of Phase- and Shift-Invariant Features by Decomposition of Natural Images into Independent Feature Subspaces , 2000, Neural Computation.
[25] Sven Behnke,et al. Learning Iterative Image Reconstruction in the Neural Abstraction Pyramid , 2001, Int. J. Comput. Intell. Appl..
[26] Yukito Iba. Extended Ensemble Monte Carlo , 2001 .
[27] Terrence J. Sejnowski,et al. Slow Feature Analysis: Unsupervised Learning of Invariances , 2002, Neural Computation.
[28] Samy Bengio,et al. Scaling Large Learning Problems with Hard Parallel Mixtures , 2002, SVM.
[29] Lawrence Cayton,et al. Algorithms for manifold learning , 2005 .
[30] Aapo Hyvärinen,et al. Estimation of Non-Normalized Statistical Models by Score Matching , 2005, J. Mach. Learn. Res..
[31] Pascal Vincent,et al. Non-Local Manifold Parzen Windows , 2005, NIPS.
[32] Yoshua Bengio,et al. Hierarchical Probabilistic Neural Network Language Model , 2005, AISTATS.
[33] Thomas Hofmann,et al. Large Margin Methods for Structured and Interdependent Output Variables , 2005, J. Mach. Learn. Res..
[34] Geoffrey E. Hinton,et al. Reducing the Dimensionality of Data with Neural Networks , 2006, Science.
[35] Geoffrey E. Hinton,et al. Modeling Human Motion Using Binary Latent Variables , 2006, NIPS.
[36] Fu Jie Huang,et al. A Tutorial on Energy-Based Learning , 2006 .
[37] Yee Whye Teh,et al. A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.
[38] Yoshua Bengio,et al. Nonlocal Estimation of Manifold Structure , 2006, Neural Computation.
[39] Christopher M. Bishop,et al. Pattern Recognition and Machine Learning (Information Science and Statistics) , 2006 .
[40] Marc'Aurelio Ranzato,et al. Efficient Learning of Sparse Representations with an Energy-Based Model , 2006, NIPS.
[41] Rajat Raina,et al. Efficient sparse coding algorithms , 2006, NIPS.
[42] Herbert Jaeger,et al. Echo state network , 2007, Scholarpedia.
[43] Geoffrey E. Hinton,et al. Restricted Boltzmann machines for collaborative filtering , 2007, ICML '07.
[44] Roger B. Grosse,et al. Shift-Invariance Sparse Coding for Audio Classification , 2007, UAI.
[45] Honglak Lee,et al. Sparse deep belief net model for visual area V2 , 2007, NIPS.
[46] Fernando Pereira,et al. Structured Learning with Approximate Inference , 2007, NIPS.
[47] Radford M. Neal. Pattern Recognition and Machine Learning , 2007, Technometrics.
[48] Marc'Aurelio Ranzato,et al. Sparse Feature Learning for Deep Belief Networks , 2007, NIPS.
[49] Thomas Hofmann,et al. Greedy Layer-Wise Training of Deep Networks , 2007 .
[50] Nicolas Le Roux,et al. Topmoumoute Online Natural Gradient Algorithm , 2007, NIPS.
[51] Rajat Raina,et al. Self-taught learning: transfer learning from unlabeled data , 2007, ICML '07.
[52] Yoshua Bengio. Learning Deep Architectures for AI , 2007, Found. Trends Mach. Learn..
[53] Jason Weston,et al. A unified architecture for natural language processing: deep neural networks with multitask learning , 2008, ICML '08.
[54] Yoshua Bengio,et al. Extracting and composing robust features with denoising autoencoders , 2008, ICML '08.
[55] Yoshua Bengio,et al. Neural net language models , 2008, Scholarpedia.
[56] Elisa Ricci,et al. Large Margin Methods for Structured Output Prediction , 2008, Computational Intelligence Paradigms.
[57] Yoshua Bengio,et al. Classification using discriminative restricted Boltzmann machines , 2008, ICML '08.
[58] Geoffrey E. Hinton,et al. A Scalable Hierarchical Distributed Language Model , 2008, NIPS.
[59] David M. Bradley,et al. Differentiable Sparse Coding , 2008, NIPS.
[60] Yoshua Bengio,et al. Slow, Decorrelated Features for Pretraining Complex Cell-like Networks , 2009, NIPS.
[61] Yann LeCun,et al. What is the best multi-stage architecture for object recognition? , 2009, 2009 IEEE 12th International Conference on Computer Vision.
[62] Quoc V. Le,et al. Measuring Invariances in Deep Networks , 2009, NIPS.
[63] Geoffrey E. Hinton,et al. Factored conditional restricted Boltzmann Machines for modeling motion style , 2009, ICML '09.
[64] Geoffrey E. Hinton,et al. Deep Boltzmann Machines , 2009, AISTATS.
[65] Jason Weston,et al. Curriculum learning , 2009, ICML '09.
[66] Ruslan Salakhutdinov,et al. Learning in Markov Random Fields using Tempered Transitions , 2009, NIPS.
[67] Guillermo Sapiro,et al. Online dictionary learning for sparse coding , 2009, ICML '09.
[69] Rajat Raina,et al. Large-scale deep unsupervised learning using graphics processors , 2009, ICML '09.
[70] Hugo Larochelle,et al. Efficient Learning of Deep Boltzmann Machines , 2010, AISTATS.
[71] Ilya Sutskever,et al. Parallelizable Sampling of Markov Random Fields , 2010, AISTATS.
[72] Dong Yu,et al. Sequential Labeling Using Deep-Structured Conditional Random Fields , 2010, IEEE Journal of Selected Topics in Signal Processing.
[73] James Martens,et al. Deep learning via Hessian-free optimization , 2010, ICML.
[74] Ruslan Salakhutdinov,et al. Learning Deep Boltzmann Machines using Adaptive MCMC , 2010, ICML.
[75] Razvan Pascanu,et al. Theano: A CPU and GPU Math Compiler in Python , 2010, SciPy.
[76] Yann LeCun,et al. Regularized estimation of image statistics by Score Matching , 2010, NIPS.
[77] Tapani Raiko,et al. Parallel tempering is efficient for learning restricted Boltzmann machines , 2010, The 2010 International Joint Conference on Neural Networks (IJCNN).
[78] Pascal Vincent,et al. Tempered Markov Chain Monte Carlo for training of Restricted Boltzmann Machines , 2010, AISTATS.
[79] Geoffrey E. Hinton,et al. Phone Recognition with the Mean-Covariance Restricted Boltzmann Machine , 2010, NIPS.
[80] Dong Yu,et al. Language recognition using deep-structured conditional random fields , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[81] Yoshua Bengio,et al. Decision Trees Do Not Generalize to New Variations , 2010, Comput. Intell..
[82] Geoffrey E. Hinton,et al. Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.
[83] Yoshua Bengio,et al. Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.
[84] Hariharan Narayanan,et al. Sample Complexity of Testing the Manifold Hypothesis , 2010, NIPS.
[85] Marc'Aurelio Ranzato,et al. Fast Inference in Sparse Coding Algorithms with Applications to Object Recognition , 2010, ArXiv.
[86] Geoffrey E. Hinton,et al. Binary coding of speech spectrograms using a deep auto-encoder , 2010, INTERSPEECH.
[87] Aapo Hyvärinen,et al. Noise-contrastive estimation: A new estimation principle for unnormalized statistical models , 2010, AISTATS.
[88] Yann LeCun,et al. Learning Fast Approximations of Sparse Coding , 2010, ICML.
[89] Veselin Stoyanov,et al. Empirical Risk Minimization of Graphical Model Parameters Given Approximate Inference, Decoding, and Model Structure , 2011, AISTATS.
[90] Pascal Vincent,et al. Contractive Auto-Encoders: Explicit Invariance During Feature Extraction , 2011, ICML.
[91] Geoffrey E. Hinton,et al. Transforming Auto-Encoders , 2011, ICANN.
[92] Pascal Vincent,et al. Higher Order Contractive Auto-Encoder , 2011, ECML/PKDD.
[93] Nando de Freitas,et al. On Autoencoders and Score Matching for Energy Based Models , 2011, ICML.
[94] Hugo Larochelle,et al. The Neural Autoregressive Distribution Estimator , 2011, AISTATS.
[95] Stephen J. Wright,et al. Hogwild: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent , 2011, NIPS.
[96] Dong Yu,et al. Conversational Speech Transcription Using Context-Dependent Deep Neural Networks , 2012, ICML.
[97] Mohamed Chtourou,et al. On the training of recurrent neural networks , 2011, Eighth International Multi-Conference on Systems, Signals & Devices.
[98] Yoshua Bengio,et al. Unsupervised Models of Images by Spike-and-Slab RBMs , 2011, ICML.
[99] Julien Mairal,et al. Structured sparsity through convex optimization , 2011, ArXiv.
[100] Yoshua Bengio,et al. Deep Sparse Rectifier Neural Networks , 2011, AISTATS.
[101] Yoshua Bengio,et al. Domain Adaptation for Large-Scale Sentiment Classification: A Deep Learning Approach , 2011, ICML.
[102] Andrew Y. Ng,et al. The Importance of Encoding Versus Training with Sparse Coding and Vector Quantization , 2011, ICML.
[103] Dong Yu,et al. Feature engineering in Context-Dependent Deep Neural Networks for conversational speech transcription , 2011, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding.
[104] Pascal Vincent,et al. A Connection Between Score Matching and Denoising Autoencoders , 2011, Neural Computation.
[105] Pascal Vincent,et al. The Manifold Tangent Classifier , 2011, NIPS.
[106] Francis R. Bach,et al. Structured Variable Selection with Sparsity-Inducing Norms , 2009, J. Mach. Learn. Res..
[107] Geoffrey E. Hinton,et al. Conditional Restricted Boltzmann Machines for Structured Output Prediction , 2011, UAI.
[108] Yee Whye Teh,et al. Bayesian Learning via Stochastic Gradient Langevin Dynamics , 2011, ICML.
[109] John D. Lafferty,et al. Learning image representations from the pixel level via hierarchical sparse coding , 2011, CVPR 2011.
[110] Jason Weston,et al. Natural Language Processing (Almost) from Scratch , 2011, J. Mach. Learn. Res..
[111] Honglak Lee,et al. An Analysis of Single-Layer Networks in Unsupervised Feature Learning , 2011, AISTATS.
[112] Pascal Vincent,et al. Quickly Generating Representative Samples from an RBM-Derived Process , 2011, Neural Computation.
[113] Nitish Srivastava,et al. Improving neural networks by preventing co-adaptation of feature detectors , 2012, ArXiv.
[114] Razvan Pascanu,et al. Learning Algorithms for the Classification Restricted Boltzmann Machine , 2012, J. Mach. Learn. Res..
[115] Yoshua Bengio,et al. Large-Scale Feature Learning With Spike-and-Slab Sparse Coding , 2012, ICML.
[116] Yoshua Bengio,et al. Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription , 2012, ICML.
[117] Kevin P. Murphy,et al. Machine learning - a probabilistic perspective , 2012, Adaptive computation and machine learning series.
[118] Marc'Aurelio Ranzato,et al. Large Scale Distributed Deep Networks , 2012, NIPS.
[119] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups , 2012, IEEE Signal Processing Magazine.
[120] Jürgen Schmidhuber,et al. Multi-column deep neural networks for image classification , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[121] Yoshua Bengio,et al. Deep Learning of Representations for Unsupervised and Transfer Learning , 2011, ICML Unsupervised and Transfer Learning.
[122] Razvan Pascanu,et al. Theano: Deep Learning on GPUs with Python , 2012 .
[123] F. Savard. Réseaux de neurones à relaxation entraînés par critère d'autoencodeur débruitant [Relaxation neural networks trained with a denoising autoencoder criterion] , 2012 .
[124] Michael G. Rabbat,et al. Communication/Computation Tradeoffs in Consensus-Based Distributed Optimization , 2012, NIPS.
[125] Yoshua Bengio,et al. Practical Recommendations for Gradient-Based Training of Deep Architectures , 2012, Neural Networks: Tricks of the Trade.
[126] Yoshua Bengio,et al. Unsupervised and Transfer Learning Challenge: a Deep Learning Approach , 2011, ICML Unsupervised and Transfer Learning.
[127] Yoshua Bengio,et al. Disentangling Factors of Variation via Generative Entangling , 2012, ArXiv.
[128] Klaus-Robert Müller,et al. Deep Boltzmann Machines and the Centering Trick , 2012, Neural Networks: Tricks of the Trade.
[129] Yoshua Bengio,et al. Evolving Culture vs Local Minima , 2012, ArXiv.
[130] Nicol N. Schraudolph,et al. Centering Neural Network Gradient Factors , 1996, Neural Networks: Tricks of the Trade.
[131] Andrew Y. Ng,et al. Emergence of Object-Selective Features in Unsupervised Feature Learning , 2012, NIPS.
[132] David Barber,et al. Bayesian reasoning and machine learning , 2012 .
[133] Tomás Mikolov. Statistical Language Models Based on Neural Networks , 2012 .
[134] Yoshua Bengio,et al. Spike-and-Slab Sparse Coding for Unsupervised Feature Discovery , 2012, ArXiv.
[135] Yoshua Bengio,et al. A Generative Process for Sampling Contractive Auto-Encoders , 2012, ICML.
[136] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[137] Yoshua Bengio,et al. Joint Training of Deep Boltzmann Machines , 2012, ArXiv.
[138] Klaus-Robert Müller,et al. Efficient BackProp , 2012, Neural Networks: Tricks of the Trade.
[139] Tapani Raiko,et al. Deep Learning Made Easier by Linear Transformations in Perceptrons , 2012, AISTATS.
[141] Hossein Mobahi,et al. Deep Learning via Semi-supervised Embedding , 2012, Neural Networks: Tricks of the Trade.
[142] Yoshua Bengio,et al. Implicit Density Estimation by Local Moment Matching to Sample from Auto-Encoders , 2012, ArXiv.
[143] Pascal Vincent,et al. Unsupervised Feature Learning and Deep Learning: A Review and New Perspectives , 2012, ArXiv.
[144] Pascal Vincent,et al. Disentangling Factors of Variation for Facial Expression Recognition , 2012, ECCV.
[145] Rob Fergus,et al. Stochastic Pooling for Regularization of Deep Convolutional Neural Networks , 2013, ICLR.
[146] Pascal Vincent,et al. Representation Learning: A Review and New Perspectives , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[147] Yoshua Bengio,et al. Big Neural Networks Waste Capacity , 2013, ICLR.
[148] Yoshua Bengio,et al. Estimating or Propagating Gradients Through Stochastic Neurons , 2013, ArXiv.
[149] Geoffrey Zweig,et al. Recent advances in deep learning for speech research at Microsoft , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[150] Marc'Aurelio Ranzato,et al. Building high-level features using large scale unsupervised learning , 2011, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[151] Yoshua Bengio,et al. Joint Training Deep Boltzmann Machines for Classification , 2013, ICLR.
[152] Razvan Pascanu,et al. On the difficulty of training recurrent neural networks , 2012, ICML.
[153] Honglak Lee,et al. Learning and Selecting Features Jointly with Point-wise Gated Boltzmann Machines , 2013, ICML.
[154] Richard S. Zemel,et al. Exploring Compositional High Order Pattern Potentials for Structured Output Learning , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.
[155] Yoshua Bengio,et al. Maxout Networks , 2013, ICML.
[156] Yoshua Bengio,et al. Better Mixing via Deep Representations , 2012, ICML.
[157] Pascal Vincent,et al. Generalized Denoising Auto-Encoders as Generative Models , 2013, NIPS.
[158] Razvan Pascanu,et al. Advances in optimizing recurrent networks , 2012, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[159] Tom Schaul,et al. No more pesky learning rates , 2012, ICML.
[160] Jason Weston,et al. A semantic matching energy function for learning with multi-relational data , 2013, Machine Learning.
[161] Yoshua Bengio. J un 2 01 3 Deep Learning of Representations : Looking Forward , 2013 .
[162] Geoffrey E. Hinton,et al. Training Recurrent Neural Networks , 2013 .
[163] Yoshua Bengio,et al. Texture Modeling with Convolutional Spike-and-Slab RBMs and Deep Extensions , 2012, AISTATS.
[164] Yoshua Bengio,et al. What regularized auto-encoders learn from the data-generating distribution , 2012, J. Mach. Learn. Res..
[165] Yoshua Bengio,et al. Deep Generative Stochastic Networks Trainable by Backprop , 2013, ICML.
[166] Razvan Pascanu,et al. Revisiting Natural Gradient for Deep Networks , 2013, ICLR.