Representation Learning: A Review and New Perspectives
[1] Karl Pearson. On lines and planes of closest fit to systems of points in space , 1901 .
[2] H. Hotelling. Analysis of a complex of statistical variables into principal components. , 1933 .
[3] D. Hubel,et al. Receptive fields of single neurones in the cat's striate cortex , 1959, The Journal of physiology.
[4] J. Besag. Statistical Analysis of Non-Lattice Data , 1975 .
[5] D. Rubin,et al. Maximum likelihood from incomplete data via the EM algorithm , 1977 .
[6] J. Friedman,et al. Projection Pursuit Regression , 1981 .
[7] Kunihiko Fukushima,et al. Neocognitron: A new algorithm for pattern recognition tolerant of deformations and shifts in position , 1982, Pattern Recognit..
[8] Y. L. Cun. Learning Process in an Asymmetric Threshold Network , 1986 .
[9] Paul Smolensky,et al. Information processing in dynamical systems: foundations of harmony theory , 1986 .
[10] Yann LeCun,et al. Learning processes in an asymmetric threshold network , 1986 .
[11] Johan Håstad,et al. Almost optimal lower bounds for small depth circuits , 1986, STOC '86.
[12] Yann LeCun,et al. Generalization and network design strategies , 1989 .
[13] Geoffrey E. Hinton,et al. Learning distributed representations of concepts. , 1989 .
[14] Lawrence D. Jackel,et al. Backpropagation Applied to Handwritten Zip Code Recognition , 1989, Neural Computation.
[15] Jordan B. Pollack,et al. Recursive Distributed Representations , 1990, Artif. Intell..
[16] Christian Jutten,et al. Blind separation of sources, part I: An adaptive algorithm based on neuromimetic architecture , 1991, Signal Process..
[17] David Haussler,et al. Unsupervised learning of distributions on binary vectors using two layer networks , 1991, NIPS 1991.
[18] Yann LeCun,et al. Tangent Prop - A Formalism for Specifying Selected Invariances in an Adaptive Network , 1991, NIPS.
[19] Geoffrey E. Hinton,et al. Self-organizing neural network that discovers surfaces in random-dot stereograms , 1992, Nature.
[20] T. Poggio,et al. Recognition and Structure from one 2D Model View: Observations on Prototypes, Object Classes and Symmetries , 1992 .
[21] Yann LeCun,et al. Efficient Pattern Recognition Using a New Transformation Distance , 1992, NIPS.
[22] Radford M. Neal. Connectionist Learning of Belief Networks , 1992, Artif. Intell..
[23] Geoffrey E. Hinton,et al. Autoencoders, Minimum Description Length and Helmholtz Free Energy , 1993, NIPS.
[24] Yoshua Bengio. A Connectionist Approach to Speech Recognition , 1993, Int. J. Pattern Recognit. Artif. Intell..
[25] Geoffrey E. Hinton,et al. Learning Mixture Models of Spatial Coherence , 1993, Neural Computation.
[26] Rich Caruana,et al. Learning Many Related Tasks at the Same Time with Backpropagation , 1994, NIPS.
[27] Pierre Comon,et al. Independent component analysis, A new concept? , 1994, Signal Process..
[28] Yoshua Bengio,et al. Learning long-term dependencies with gradient descent is difficult , 1994, IEEE Trans. Neural Networks.
[29] Henry S. Baird,et al. Document image defect models , 1995 .
[30] David J. Field,et al. Emergence of simple-cell receptive field properties by learning a sparse code for natural images , 1996, Nature.
[31] David J. Field,et al. Sparse coding with an overcomplete basis set: A strategy employed by V1? , 1997, Vision Research.
[32] Sam T. Roweis,et al. EM Algorithms for PCA and Sensible PCA , 1997, NIPS 1997.
[33] H. Sebastian Seung,et al. Learning Continuous Attractors in Recurrent Networks , 1997, NIPS.
[34] Terrence J. Sejnowski,et al. The “independent components” of natural scenes are edge filters , 1997, Vision Research.
[35] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[36] Bernhard Schölkopf,et al. Nonlinear Component Analysis as a Kernel Eigenvalue Problem , 1998, Neural Computation.
[37] Shun-ichi Amari,et al. Natural Gradient Works Efficiently in Learning , 1998, Neural Computation.
[38] Alessandro Sperduti,et al. A general framework for adaptive processing of data structures , 1998, IEEE Trans. Neural Networks.
[39] T. Poggio,et al. Hierarchical models of object recognition in cortex , 1999, Nature Neuroscience.
[40] L. Younes. On the convergence of markovian stochastic algorithms with rapidly decreasing ergodicity rates , 1999 .
[41] David G. Lowe,et al. Object recognition from local scale-invariant features , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.
[42] Michael E. Tipping,et al. Probabilistic Principal Component Analysis , 1999 .
[43] J. Tenenbaum,et al. A global geometric framework for nonlinear dimensionality reduction. , 2000, Science.
[44] Yoshua Bengio,et al. A Neural Probabilistic Language Model , 2003, J. Mach. Learn. Res..
[45] S T Roweis,et al. Nonlinear dimensionality reduction by locally linear embedding. , 2000, Science.
[46] Aapo Hyvärinen,et al. Emergence of Phase- and Shift-Invariant Features by Decomposition of Natural Images into Independent Feature Subspaces , 2000, Neural Computation.
[47] Aapo Hyvärinen,et al. Topographic Independent Component Analysis , 2001, Neural Computation.
[48] Erkki Oja,et al. Independent Component Analysis , 2001 .
[49] Geoffrey E. Hinton,et al. Learning Sparse Topographic Representations with Products of Student-t Distributions , 2002, NIPS.
[50] Geoffrey E. Hinton,et al. Stochastic Neighbor Embedding , 2002, NIPS.
[51] Aapo Hyvärinen,et al. Temporal Coherence, Natural Image Sequences, and the Visual Cortex , 2002, NIPS.
[52] Pascal Vincent,et al. Manifold Parzen Windows , 2002, NIPS.
[53] Terrence J. Sejnowski,et al. Slow Feature Analysis: Unsupervised Learning of Invariances , 2002, Neural Computation.
[54] Geoffrey E. Hinton. Training Products of Experts by Minimizing Contrastive Divergence , 2002, Neural Computation.
[55] J. van Leeuwen,et al. Neural Networks: Tricks of the Trade , 2002, Lecture Notes in Computer Science.
[56] Matthew Brand,et al. Charting a Manifold , 2002, NIPS.
[57] Patrice Y. Simard,et al. Best practices for convolutional neural networks applied to visual document analysis , 2003, Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings..
[58] D. Donoho,et al. Hessian eigenmaps: Locally linear embedding techniques for high-dimensional data , 2003, Proceedings of the National Academy of Sciences of the United States of America.
[59] Nicolas Le Roux,et al. Out-of-Sample Extensions for LLE, Isomap, MDS, Eigenmaps, and Spectral Clustering , 2003, NIPS.
[60] Mikhail Belkin,et al. Laplacian Eigenmaps for Dimensionality Reduction and Data Representation , 2003, Neural Computation.
[61] D. Donoho,et al. Hessian Eigenmaps : new locally linear embedding techniques for high-dimensional data , 2003 .
[62] Konrad Paul Kording,et al. How are complex cell properties adapted to the statistics of natural stimuli? , 2004, Journal of neurophysiology.
[63] Kunihiko Fukushima,et al. Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position , 1980, Biological Cybernetics.
[64] Alan L. Yuille,et al. The Convergence of Contrastive Divergences , 2004, NIPS.
[65] Kilian Q. Weinberger,et al. Unsupervised Learning of Image Manifolds by Semidefinite Programming , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..
[66] Yoshua Bengio,et al. Non-Local Manifold Tangent Learning , 2004, NIPS.
[67] H. Bourlard,et al. Auto-association by multilayer perceptrons and singular value decomposition , 1988, Biological Cybernetics.
[68] Lawrence Cayton,et al. Algorithms for manifold learning , 2005 .
[69] Laurenz Wiskott,et al. Slow feature analysis yields a rich repertoire of complex cell properties. , 2005, Journal of vision.
[70] Johan Håstad,et al. On the power of small-depth threshold circuits , 1991, computational complexity.
[71] Aapo Hyvärinen,et al. Estimation of Non-Normalized Statistical Models by Score Matching , 2005, J. Mach. Learn. Res..
[72] Pascal Vincent,et al. Non-Local Manifold Parzen Windows , 2005, NIPS.
[73] Nicolas Le Roux,et al. The Curse of Highly Variable Functions for Local Kernel Machines , 2005, NIPS.
[74] Miguel Á. Carreira-Perpiñán,et al. On Contrastive Divergence Learning , 2005, AISTATS.
[75] Yoshua Bengio,et al. Greedy Layer-Wise Training of Deep Networks , 2006, NIPS.
[76] Geoffrey E. Hinton,et al. Reducing the Dimensionality of Data with Neural Networks , 2006, Science.
[77] Daniel Marcu,et al. Domain Adaptation for Statistical Classifiers , 2006, J. Artif. Intell. Res..
[78] Cordelia Schmid,et al. Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).
[79] Yee Whye Teh,et al. A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.
[80] Marc'Aurelio Ranzato,et al. Efficient Learning of Sparse Representations with an Energy-Based Model , 2006, NIPS.
[81] Yee Whye Teh,et al. Unsupervised Discovery of Nonlinear Structure Using Contrastive Backpropagation , 2006, Cogn. Sci..
[82] Max Welling,et al. Products of Experts , 2007 .
[83] Geoffrey E. Hinton,et al. Restricted Boltzmann machines for collaborative filtering , 2007, ICML '07.
[84] Roger B. Grosse,et al. Shift-Invariance Sparse Coding for Audio Classification , 2007, UAI.
[85] Honglak Lee,et al. Sparse deep belief net model for visual area V2 , 2007, NIPS.
[86] Bruno A. Olshausen,et al. Learning Horizontal Connections in a Sparse Coding Model of Natural Images , 2007, NIPS.
[87] Marc'Aurelio Ranzato,et al. Sparse Feature Learning for Deep Belief Networks , 2007, NIPS.
[88] Nicolas Le Roux,et al. Learning the 2-D Topology of Images , 2007, NIPS.
[89] Nicolas Le Roux,et al. Topmoumoute Online Natural Gradient Algorithm , 2007, NIPS.
[90] Yoshua Bengio,et al. Scaling learning algorithms towards AI , 2007 .
[91] Thomas Serre,et al. Robust Object Recognition with Cortex-Like Mechanisms , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[92] Rajat Raina,et al. Self-taught learning: transfer learning from unlabeled data , 2007, ICML '07.
[93] Aapo Hyvärinen,et al. Some extensions of score matching , 2007, Comput. Stat. Data Anal..
[94] Yoshua Bengio,et al. Learning Deep Architectures for AI , 2007, Found. Trends Mach. Learn..
[95] Geoffrey E. Hinton,et al. The Recurrent Temporal Restricted Boltzmann Machine , 2008, NIPS.
[96] Geoffrey E. Hinton,et al. Visualizing Data using t-SNE , 2008 .
[97] Ruslan Salakhutdinov,et al. Evaluating probabilities under high-dimensional latent variable models , 2008, NIPS.
[98] Geoffrey E. Hinton,et al. Generative versus discriminative training of RBMs for classification of fMRI images , 2008, NIPS.
[99] Jason Weston,et al. A unified architecture for natural language processing: deep neural networks with multitask learning , 2008, ICML '08.
[100] Bruno A. Olshausen,et al. Learning Transformational Invariants from Natural Movies , 2008, NIPS.
[101] Tijmen Tieleman,et al. Training restricted Boltzmann machines using approximations to the likelihood gradient , 2008, ICML '08.
[102] Jason Weston,et al. Deep learning via semi-supervised embedding , 2008, ICML '08.
[103] Yoshua Bengio,et al. Extracting and composing robust features with denoising autoencoders , 2008, ICML '08.
[104] Yoshua Bengio,et al. Neural net language models , 2008, Scholarpedia.
[105] Yoshua Bengio,et al. Classification using discriminative restricted Boltzmann machines , 2008, ICML '08.
[106] H. Sebastian Seung,et al. Natural Image Denoising with Convolutional Networks , 2008, NIPS.
[107] David M. Bradley,et al. Differentiable Sparse Coding , 2008, NIPS.
[108] Aapo Hyvärinen,et al. Optimal Approximation of Signal Priors , 2008, Neural Computation.
[109] Geoffrey E. Hinton,et al. Using fast weights to improve persistent contrastive divergence , 2009, ICML '09.
[110] Yoshua Bengio,et al. Slow, Decorrelated Features for Pretraining Complex Cell-like Networks , 2009, NIPS.
[111] Yoshua Bengio,et al. Exploring Strategies for Training Deep Neural Networks , 2009, J. Mach. Learn. Res..
[112] Honglak Lee,et al. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations , 2009, ICML '09.
[113] Yann LeCun,et al. What is the best multi-stage architecture for object recognition? , 2009, 2009 IEEE 12th International Conference on Computer Vision.
[114] Quoc V. Le,et al. Measuring Invariances in Deep Networks , 2009, NIPS.
[115] Geoffrey E. Hinton,et al. Factored conditional restricted Boltzmann Machines for modeling motion style , 2009, ICML '09.
[116] Aapo Hyvärinen,et al. Natural Image Statistics - A Probabilistic Approach to Early Computational Vision , 2009, Computational Imaging and Vision.
[117] R. Fergus,et al. Learning invariant features through topographic filter maps , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[118] Alex Krizhevsky,et al. Learning Multiple Layers of Features from Tiny Images , 2009 .
[119] Max Welling,et al. Herding Dynamic Weights for Partially Observed Random Field Models , 2009, UAI.
[120] Pascal Vincent,et al. Deep Learning using Robust Interdependent Codes , 2009, AISTATS.
[121] Geoffrey E. Hinton,et al. Deep Boltzmann Machines , 2009, AISTATS.
[122] Jason Weston,et al. Curriculum learning , 2009, ICML '09.
[123] Yihong Gong,et al. Nonlinear Learning using Local Coordinate Coding , 2009, NIPS.
[124] Yoshua Bengio,et al. Justifying and Generalizing Contrastive Divergence , 2009, Neural Computation.
[125] Laurens van der Maaten,et al. Learning a Parametric Embedding by Preserving Local Structure , 2009, AISTATS.
[126] Honglak Lee,et al. Unsupervised feature learning for audio classification using convolutional deep belief networks , 2009, NIPS.
[127] Ruslan Salakhutdinov,et al. Learning in Markov Random Fields using Tempered Transitions , 2009, NIPS.
[128] Geoffrey E. Hinton,et al. Semantic hashing , 2009, Int. J. Approx. Reason..
[129] Hossein Mobahi,et al. Deep learning from temporal coherence in video , 2009, ICML '09.
[130] Hugo Larochelle,et al. Efficient Learning of Deep Boltzmann Machines , 2010, AISTATS.
[131] Aaron C. Courville,et al. Understanding Representations Learned in Deep Architectures , 2010 .
[132] Quoc V. Le,et al. Tiled convolutional neural networks , 2010, NIPS.
[133] Yoshua Bengio,et al. Why Does Unsupervised Pre-training Help Deep Learning? , 2010, AISTATS.
[134] Dong Yu,et al. Sequential Labeling Using Deep-Structured Conditional Random Fields , 2010, IEEE Journal of Selected Topics in Signal Processing.
[135] Geoffrey E. Hinton,et al. Learning to Represent Spatial Transformations with Factored Higher-Order Boltzmann Machines , 2010, Neural Computation.
[136] Yann LeCun,et al. Emergence of Complex-Like Cells in a Temporal Product Network with Local Receptive Fields , 2010, ArXiv.
[137] Jean Ponce,et al. A Theoretical Analysis of Feature Pooling in Visual Recognition , 2010, ICML.
[138] James Martens,et al. Deep learning via Hessian-free optimization , 2010, ICML.
[139] Yann LeCun,et al. Convolutional Learning of Spatio-temporal Features , 2010, ECCV.
[140] J. Andrew Bagnell,et al. Boosted Backpropagation Learning for Training Deep Modular Networks , 2010, ICML.
[141] Ilya Sutskever,et al. On the Convergence Properties of Contrastive Divergence , 2010, AISTATS.
[142] Ruslan Salakhutdinov,et al. Learning Deep Boltzmann Machines using Adaptive MCMC , 2010, ICML.
[143] Yann LeCun,et al. Regularized estimation of image statistics by Score Matching , 2010, NIPS.
[144] Tapani Raiko,et al. Parallel tempering is efficient for learning restricted Boltzmann machines , 2010, The 2010 International Joint Conference on Neural Networks (IJCNN).
[145] Geoffrey E. Hinton,et al. Generating more realistic images using gated MRF's , 2010, NIPS.
[146] Kevin Swersky,et al. Inductive Principles for Learning Restricted Boltzmann Machines , 2010 .
[147] Pascal Vincent,et al. Tempered Markov Chain Monte Carlo for training of Restricted Boltzmann Machines , 2010, AISTATS.
[148] Graham W. Taylor,et al. Deconvolutional networks , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[149] Y-Lan Boureau,et al. Learning Convolutional Feature Hierarchies for Visual Recognition , 2010, NIPS.
[150] Geoffrey E. Hinton,et al. Phone Recognition with the Mean-Covariance Restricted Boltzmann Machine , 2010, NIPS.
[151] Yoshua Bengio,et al. Decision Trees Do Not Generalize to New Variations , 2010, Comput. Intell..
[152] Geoffrey E. Hinton,et al. Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.
[153] Tong Zhang,et al. Improved Local Coordinate Coding using Local Tangents , 2010, ICML.
[154] Jason Weston,et al. Large scale image annotation: learning to rank with joint word-image embeddings , 2010, Machine Learning.
[155] Yoshua Bengio,et al. Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.
[156] Hariharan Narayanan,et al. Sample Complexity of Testing the Manifold Hypothesis , 2010, NIPS.
[157] Luca Maria Gambardella,et al. Deep, Big, Simple Neural Nets for Handwritten Digit Recognition , 2010, Neural Computation.
[158] A. Krizhevsky. Convolutional Deep Belief Networks on CIFAR-10 , 2010 .
[159] Nando de Freitas,et al. Inductive Principles for Restricted Boltzmann Machine Learning , 2010, AISTATS.
[160] Marc'Aurelio Ranzato,et al. Fast Inference in Sparse Coding Algorithms with Applications to Object Recognition , 2010, ArXiv.
[161] Nando de Freitas,et al. A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning , 2010, ArXiv.
[162] Shenghuo Zhu,et al. Deep Coding Network , 2010, NIPS.
[163] Pascal Vincent,et al. Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion , 2010, J. Mach. Learn. Res..
[164] Geoffrey E. Hinton,et al. Modeling pixel means and covariances using factorized third-order boltzmann machines , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[165] Geoffrey E. Hinton,et al. Binary coding of speech spectrograms using a deep auto-encoder , 2010, INTERSPEECH.
[166] Aapo Hyvärinen,et al. Noise-contrastive estimation: A new estimation principle for unnormalized statistical models , 2010, AISTATS.
[167] Geoffrey E. Hinton,et al. Factored 3-Way Restricted Boltzmann Machines For Modeling Natural Images , 2010, AISTATS.
[168] Yann LeCun,et al. Learning Fast Approximations of Sparse Coding , 2010, ICML.
[169] Joseph F. Murray,et al. Convolutional Networks Can Learn to Generate Affinity Graphs for Image Segmentation , 2010, Neural Computation.
[170] Yoshua Bengio,et al. Algorithms for Hyper-Parameter Optimization , 2011, NIPS.
[171] Nando de Freitas,et al. Asymptotic Efficiency of Deterministic Estimators for Discrete Energy-Based Models: Ratio Matching and Pseudolikelihood , 2011, UAI.
[172] Yann LeCun,et al. Structured sparse coding via lateral inhibition , 2011, NIPS.
[173] Veselin Stoyanov,et al. Empirical Risk Minimization of Graphical Model Parameters Given Approximate Inference, Decoding, and Model Structure , 2011, AISTATS.
[174] Quoc V. Le,et al. On optimization methods for deep learning , 2011, ICML.
[175] Ilya Sutskever,et al. Learning Recurrent Neural Networks with Hessian-Free Optimization , 2011, ICML.
[176] Radford M. Neal. Probabilistic Inference Using Markov Chain Monte Carlo Methods , 2011 .
[177] Andrew Y. Ng,et al. Selecting Receptive Fields in Deep Networks , 2011, NIPS.
[178] Pascal Vincent,et al. Contractive Auto-Encoders: Explicit Invariance During Feature Extraction , 2011, ICML.
[179] Tapani Raiko,et al. Enhanced Gradient and Adaptive Learning Rate for Training Restricted Boltzmann Machines , 2011, ICML.
[180] Geoffrey E. Hinton,et al. Transforming Auto-Encoders , 2011, ICANN.
[181] Pascal Vincent,et al. Higher Order Contractive Auto-Encoder , 2011, ECML/PKDD.
[182] Yoshua Bengio,et al. On Tracking The Partition Function , 2011, NIPS.
[183] Nando de Freitas,et al. On Autoencoders and Score Matching for Energy Based Models , 2011, ICML.
[184] Yann LeCun,et al. Unsupervised Learning of Sparse Features for Scalable Audio Classification , 2011, ISMIR.
[185] Quoc V. Le,et al. Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis , 2011, CVPR 2011.
[186] Quoc V. Le,et al. ICA with Reconstruction Cost for Efficient Overcomplete Feature Learning , 2011, NIPS.
[187] Mohamed Chtourou,et al. On the training of recurrent neural networks , 2011, Eighth International Multi-Conference on Systems, Signals & Devices.
[188] Yoshua Bengio,et al. Unsupervised Models of Images by Spike-and-Slab RBMs , 2011, ICML.
[189] Stéphane Mallat,et al. Group Invariant Scattering , 2011, ArXiv.
[190] Julien Mairal,et al. Structured sparsity through convex optimization , 2011, ArXiv.
[191] Katherine A. Heller,et al. Bayesian and L1 Approaches to Sparse Unsupervised Learning , 2011, ICML 2012.
[192] Yoshua Bengio,et al. Deep Sparse Rectifier Neural Networks , 2011, AISTATS.
[193] Yoshua Bengio,et al. Domain Adaptation for Large-Scale Sentiment Classification: A Deep Learning Approach , 2011, ICML.
[194] Andrew Y. Ng,et al. The Importance of Encoding Versus Training with Sparse Coding and Vector Quantization , 2011, ICML.
[195] Dong Yu,et al. Feature engineering in Context-Dependent Deep Neural Networks for conversational speech transcription , 2011, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding.
[196] Lukás Burget,et al. Empirical Evaluation and Combination of Advanced Language Modeling Techniques , 2011, INTERSPEECH.
[197] Yoshua Bengio,et al. A Spike and Slab Restricted Boltzmann Machine , 2011, AISTATS.
[198] Geoffrey E. Hinton,et al. Transforming Autoencoders , 2011 .
[199] Jörg Lücke,et al. A Closed-Form EM Algorithm for Sparse Coding , 2011 .
[200] Stéphane Mallat,et al. Classification with scattering operators , 2010, CVPR 2011.
[201] Pascal Vincent,et al. A Connection Between Score Matching and Denoising Autoencoders , 2011, Neural Computation.
[202] Pascal Vincent,et al. The Manifold Tangent Classifier , 2011, NIPS.
[203] Yoshua Bengio,et al. On the Expressive Power of Deep Architectures , 2011, ALT.
[204] Francis R. Bach,et al. Structured Variable Selection with Sparsity-Inducing Norms , 2009, J. Mach. Learn. Res..
[205] Nicolas Le Roux,et al. Ask the locals: Multi-way local pooling for image recognition , 2011, 2011 International Conference on Computer Vision.
[206] Yoshua Bengio,et al. Large-Scale Learning of Embeddings with Reconstruction Sampling , 2011, ICML.
[207] Jeffrey Pennington,et al. Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection , 2011, NIPS.
[208] Jiquan Ngiam,et al. Learning Deep Energy Models , 2011, ICML.
[209] Berin Martini,et al. Large-Scale FPGA-based Convolutional Networks , 2011 .
[210] John D. Lafferty,et al. Learning image representations from the pixel level via hierarchical sparse coding , 2011, CVPR 2011.
[211] Jason Weston,et al. Natural Language Processing (Almost) from Scratch , 2011, J. Mach. Learn. Res..
[212] Miguel Lázaro-Gredilla,et al. Spike and Slab Variational Inference for Multi-Task and Multiple Kernel Learning , 2011, NIPS.
[213] Jeffrey Pennington,et al. Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions , 2011, EMNLP.
[214] Pascal Vincent,et al. Quickly Generating Representative Samples from an RBM-Derived Process , 2011, Neural Computation.
[215] Will Y. Zou. Unsupervised learning of visual invariance with temporal coherence , 2011 .
[216] Douglas Eck,et al. Temporal Pooling and Multiscale Learning for Automatic Annotation and Ranking of Music Audio , 2011, ISMIR.
[217] Geoffrey E. Hinton,et al. On deep generative models with applications to recognition , 2011, CVPR 2011.
[218] Rémi Gribonval,et al. Should Penalized Least Squares Regression be Interpreted as Maximum A Posteriori Estimation? , 2011, IEEE Transactions on Signal Processing.
[219] Yoshua Bengio,et al. Random Search for Hyper-Parameter Optimization , 2012, J. Mach. Learn. Res..
[220] Yoshua Bengio,et al. Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription , 2012, ICML.
[221] Jasper Snoek,et al. Practical Bayesian Optimization of Machine Learning Algorithms , 2012, NIPS.
[222] Tara N. Sainath,et al. Fundamental Technologies in Modern Speech Recognition (doi:10.1109/MSP.2012.2205597) , 2012 .
[223] Jürgen Schmidhuber,et al. Multi-column deep neural networks for image classification , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[224] James J. DiCarlo,et al. How Does the Brain Solve Visual Object Recognition? , 2012, Neuron.
[225] Yoshua Bengio,et al. Deep Learning of Representations for Unsupervised and Transfer Learning , 2011, ICML Unsupervised and Transfer Learning.
[226] F. Savard. Réseaux de neurones à relaxation entraînés par critère d'autoencodeur débruitant , 2012 .
[227] Dong Yu,et al. Conversational Speech Transcription Using Context-Dependent Deep Neural Networks , 2012, ICML.
[228] Yoshua Bengio,et al. On Training Deep Boltzmann Machines , 2012, ArXiv.
[229] Yoshua Bengio,et al. Practical Recommendations for Gradient-Based Training of Deep Architectures , 2012, Neural Networks: Tricks of the Trade.
[230] Nitish Srivastava,et al. Multimodal learning with deep Boltzmann machines , 2012, J. Mach. Learn. Res..
[231] Yoshua Bengio,et al. Unsupervised and Transfer Learning Challenge: a Deep Learning Approach , 2011, ICML Unsupervised and Transfer Learning.
[232] Dong Yu,et al. Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[233] Jörg Lücke,et al. Closed-Form EM for Sparse Coding and Its Application to Source Separation , 2011, LVA/ICA.
[234] Kilian Q. Weinberger,et al. Marginalized Denoising Autoencoders for Domain Adaptation , 2012, ICML.
[235] Yoshua Bengio,et al. Spike-and-Slab Sparse Coding for Unsupervised Feature Discovery , 2012, ArXiv.
[236] Yoshua Bengio,et al. A Generative Process for sampling Contractive Auto-Encoders , 2012, ICML 2012.
[237] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[238] Grégoire Montavon,et al. Neural Networks: Tricks of the Trade , 2012, Lecture Notes in Computer Science.
[239] Klaus-Robert Müller,et al. Efficient BackProp , 2012, Neural Networks: Tricks of the Trade.
[240] Tapani Raiko,et al. Deep Learning Made Easier by Linear Transformations in Perceptrons , 2012, AISTATS.
[241] Geoffrey E. Hinton,et al. Acoustic Modeling Using Deep Belief Networks , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[242] Holger Schwenk,et al. Large, Pruned or Continuous Space Language Models on a GPU for Statistical Machine Translation , 2012, WLM@NAACL-HLT.
[243] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition , 2012 .
[244] Geoffrey E. Hinton. A Practical Guide to Training Restricted Boltzmann Machines , 2012, Neural Networks: Tricks of the Trade.
[245] Yoshua Bengio,et al. Implicit Density Estimation by Local Moment Matching to Sample from Auto-Encoders , 2012, ArXiv.
[246] Jason Weston,et al. Joint Learning of Words and Meaning Representations for Open-Text Semantic Parsing , 2012, AISTATS.
[247] Christopher K. I. Williams,et al. Multiple Texture Boltzmann Machines , 2012, AISTATS.
[248] Alexandre Allauzen,et al. Structured Output Layer Neural Network Language Models for Speech Recognition , 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[249] Razvan Pascanu,et al. Natural Gradient Revisited , 2013, ICLR.
[250] Léon Bottou,et al. From machine learning to machine reasoning , 2011, Machine Learning.
[251] Yoshua Bengio,et al. Better Mixing via Deep Representations , 2012, ICML.
[252] Geoffrey E. Hinton,et al. Training Recurrent Neural Networks , 2013 .
[253] Yoshua Bengio,et al. What regularized auto-encoders learn from the data-generating distribution , 2012, J. Mach. Learn. Res..
[254] M. Yüksel,et al. A Ph.D. Thesis , 2014 .
[255] Razvan Pascanu,et al. Revisiting Natural Gradient for Deep Networks , 2013, ICLR.
[256] Jason Morton,et al. When Does a Mixture of Products Contain a Product of Mixtures? , 2012, SIAM J. Discret. Math..