Unsupervised Discovery of Nonlinear Structure Using Contrastive Backpropagation

We describe a way of modeling high-dimensional data vectors by using an unsupervised, nonlinear, multilayer neural network in which the activity of each neuron-like unit makes an additive contribution to a global energy score that indicates how surprised the network is by the data vector. The connection weights that determine how the activity of each unit depends on the activities in earlier layers are learned by minimizing the energy assigned to data vectors that are actually observed and maximizing the energy assigned to "confabulations" that are generated by perturbing an observed data vector in a direction that decreases its energy under the current model.
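The learning rule described above can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions made only for illustration: a single layer of units with weight matrix `W`, a per-unit energy contribution of log(1 + a^2) where a is the unit's total input, confabulations produced by a few plain gradient-descent steps on the data vector (with no added noise), and hypothetical function names and step sizes.

```python
import numpy as np

def energy(W, x):
    """Global energy: unit j adds log(1 + a_j^2), with a_j = w_j . x (assumed form)."""
    a = W @ x
    return float(np.sum(np.log1p(a ** 2)))

def energy_grads(W, x):
    """Backpropagate the energy to get (dE/dx, dE/dW)."""
    a = W @ x
    g = 2.0 * a / (1.0 + a ** 2)      # dE/da_j for each unit
    return W.T @ g, np.outer(g, x)

def confabulate(W, x, step=0.1, n_steps=5):
    """Perturb a data vector in the direction that lowers its energy."""
    x = x.copy()
    for _ in range(n_steps):
        dx, _ = energy_grads(W, x)
        x -= step * dx                # move downhill in energy
    return x

def cbp_update(W, x, lr=0.01):
    """One contrastive step: lower the energy of the data vector,
    raise the energy of its confabulation."""
    x_conf = confabulate(W, x)
    _, dW_data = energy_grads(W, x)
    _, dW_conf = energy_grads(W, x_conf)
    W_new = W - lr * (dW_data - dW_conf)
    return W_new, x_conf

# Illustrative values only.
W = np.array([[0.5, 0.0, 0.0],
              [0.0, 0.5, 0.0],
              [0.0, 0.0, 0.5],
              [0.3, 0.3, 0.3]])
x = np.array([1.0, -2.0, 0.5])
W_new, x_conf = cbp_update(W, x)
```

In this sketch the same backward pass supplies both gradients: the derivative with respect to the input generates the confabulation, and the derivative with respect to the weights drives learning. A deeper network would follow the same pattern with more layers in the backward pass.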
