Neural Networks Trained on Natural Scenes Exhibit Gestalt Closure

The Gestalt laws of perceptual organization, which describe how visual elements in an image are grouped and interpreted, have traditionally been thought of as innate. Given past research showing that these laws have ecological validity, we investigate whether deep learning methods infer Gestalt laws from the statistics of natural scenes. We examine the law of closure, which asserts that human visual perception tends to “close the gap” by assembling elements that can jointly be interpreted as a complete figure or object. We demonstrate that a state-of-the-art convolutional neural network, trained to classify natural images, exhibits closure on synthetic displays of edge fragments, as assessed by similarity of internal representations. This finding provides further support for the hypothesis that the human perceptual system is even more elegant than the Gestaltists imagined: a single law—adaptation to the statistical structure of the environment—might suffice as fundamental.
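As a rough illustration of what "similarity of internal representations" could mean in practice, the sketch below compares activation vectors for three kinds of display. It is not the authors' procedure. The cosine measure, the function names `cosine_similarity` and `closure_score`, the three display conditions (a complete figure, fragments aligned to suggest that figure, and the same fragments in a disordered arrangement), and the toy vectors are all assumptions made for illustration. In a real experiment the vectors would be activations read from a layer of the trained network.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between two activation vectors (flattened first)."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def closure_score(complete, aligned, disordered):
    """Hypothetical closure index (an assumption, not the paper's metric).

    Positive when the representation of aligned fragments is more similar
    to the complete figure than the representation of disordered fragments.
    """
    return cosine_similarity(complete, aligned) - cosine_similarity(complete, disordered)


# Toy vectors standing in for layer activations; real ones would come from
# a forward pass of the trained network on each display.
complete = np.array([1.0, 0.0, 0.0])
aligned = np.array([1.0, 1.0, 0.0])
disordered = np.array([0.0, 1.0, 0.0])

score = closure_score(complete, aligned, disordered)
print(f"closure score: {score:.4f}")
```

Under these assumptions, a score above zero would be read as evidence that the network treats the aligned fragments more like the completed figure. A difference of similarities is only one possible summary, and a ratio or a normalized index would serve equally well.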
