RNADE: The real-valued neural autoregressive density-estimator

We introduce RNADE, a new model for joint density estimation of real-valued vectors. Our model calculates the density of a datapoint as the product of one-dimensional conditionals modeled using mixture density networks with shared parameters. RNADE learns a distributed representation of the data, while having a tractable expression for the calculation of densities. A tractable likelihood allows direct comparison with other methods and training by standard gradient-based optimizers. We compare the performance of RNADE on several datasets of heterogeneous and perceptual data, finding it outperforms mixture models in all but one case.
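The factorization described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: it assumes rectified hidden units, `K` Gaussian components per conditional, and a shared running activation updated after each observed dimension; all parameter names (`W`, `c`, `V_alpha`, etc.) are illustrative.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def rnade_log_density(x, W, c, V_alpha, b_alpha, V_mu, b_mu, V_s, b_s):
    """Log-density of one datapoint x (shape (D,)) as a product of
    one-dimensional Gaussian-mixture conditionals, sketching the
    autoregressive scheme: parameters of p(x_d | x_<d) come from a
    hidden state shared across dimensions and updated incrementally."""
    D = x.shape[0]
    a = c.copy()                 # running pre-activation, shape (H,)
    logp = 0.0
    for d in range(D):
        h = np.maximum(a, 0)     # rectified hidden units (assumption)
        alpha = softmax(V_alpha[d] @ h + b_alpha[d])  # mixing weights, (K,)
        mu = V_mu[d] @ h + b_mu[d]                    # component means, (K,)
        sigma = np.exp(V_s[d] @ h + b_s[d])           # component stds, (K,)
        # log-density of x[d] under each 1-D Gaussian component
        log_norm = (-0.5 * ((x[d] - mu) / sigma) ** 2
                    - np.log(sigma) - 0.5 * np.log(2 * np.pi))
        logp += np.log(np.sum(alpha * np.exp(log_norm)) + 1e-300)
        a = a + x[d] * W[:, d]   # fold the observed x[d] into the state
    return logp
```

Because the log-likelihood is an exact, differentiable expression of the parameters, a model of this form can be trained directly with standard gradient-based optimizers, as the abstract notes.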
