Binding via Reconstruction Clustering

Disentangled distributed representations of data are desirable for machine learning, since they are more expressive and can generalize from fewer examples. However, for complex data, the distributed representations of multiple objects present in the same input can interfere and lead to ambiguities, which is commonly referred to as the binding problem. We argue for the importance of the binding problem to the field of representation learning, and develop a probabilistic framework that explicitly models inputs as a composition of multiple objects. We propose an unsupervised algorithm that uses denoising autoencoders to dynamically bind features together in multi-object inputs through an Expectation-Maximization-like clustering process. The effectiveness of this method is demonstrated on artificially generated datasets of binary images, showing that it can even generalize to bind together new objects never seen by the autoencoder during training.
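The Expectation-Maximization-like clustering loop mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation. It makes three assumptions that go beyond the abstract: each cluster sees the input weighted by its current soft assignments, pixels are scored with a Bernoulli likelihood (the data are binary images), and a hypothetical template-matching function `toy_denoise` stands in for a trained denoising autoencoder. The names `reconstruction_clustering`, `toy_denoise`, and `TEMPLATES` are illustrative only.

```python
import numpy as np

# Hypothetical stand-in for a trained denoising autoencoder: it "cleans up"
# an input by snapping it to whichever known object template overlaps most.
TEMPLATES = np.array(
    [[1, 1, 1, 1, 0, 0, 0, 0],
     [0, 0, 0, 0, 1, 1, 1, 1]],
    dtype=float,
)

def toy_denoise(v):
    """Return the template with the largest dot product with v."""
    return TEMPLATES[np.argmax(TEMPLATES @ v)]

def reconstruction_clustering(x, denoise, gamma, n_iters=5, eps=1e-6):
    """EM-like soft clustering of the pixels of x into K groups.

    x       : binary input of shape (D,)
    denoise : callable mapping a (D,) array to a (D,) reconstruction
    gamma   : initial soft assignments of shape (K, D), columns sum to 1
    Returns the final assignments and the per-cluster reconstructions.
    """
    x = np.asarray(x, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    for _ in range(n_iters):
        # M-like step (assumption): each cluster reconstructs the input
        # weighted by its own current assignments.
        mu = np.stack([denoise(g * x) for g in gamma])
        mu = np.clip(mu, eps, 1.0 - eps)
        # E-like step: reassign each pixel in proportion to how well each
        # cluster's reconstruction explains it (Bernoulli likelihood).
        lik = mu ** x * (1.0 - mu) ** (1.0 - x)
        gamma = lik / lik.sum(axis=0, keepdims=True)
    return gamma, mu

# Usage: an input containing both toy objects at once, with a slightly
# asymmetric initial assignment so the two clusters can specialize.
x = np.ones(8)
gamma0 = np.array([[0.6] * 4 + [0.4] * 4,
                   [0.4] * 4 + [0.6] * 4])
gamma, mu = reconstruction_clustering(x, toy_denoise, gamma0)
print(gamma.argmax(axis=0))  # hard cluster label per pixel
```

The initial assignments are passed in explicitly because a perfectly symmetric start can leave both clusters reconstructing the same object; how a real system breaks that symmetry is left open in this sketch.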
