Do Neural Networks for Segmentation Understand Insideness?

Kimberly Villalobos∗,1, Vilim Štih∗,1,2, Amineh Ahmadinejad∗,1, Shobhita Sundaram, Jamell Dozier, Andrew Francl, Frederico Azevedo, Tomotake Sasaki†,3,1, Xavier Boix†,1,•

∗ and † indicate equal contribution
1 Center for Brains, Minds and Machines, MIT (USA)
2 Max Planck Institute of Neurobiology (Germany)
3 Fujitsu Laboratories Ltd. (Japan)
• Correspondence to xboix@mit.edu

This material is based upon work supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216.

The insideness problem is an aspect of image segmentation that consists of determining which pixels lie inside and which lie outside a region. Deep Neural Networks (DNNs) excel in segmentation benchmarks, but it is unclear whether they can solve the insideness problem, as it requires evaluating long-range spatial dependencies. In this paper, the insideness problem is analysed in isolation, without texture or semantic cues, so that other aspects of segmentation do not interfere with the analysis. We demonstrate that DNNs for segmentation with few units have sufficient complexity to solve insideness for any curve. Yet such DNNs have severe difficulty learning general solutions. Only recurrent networks trained with small images learn solutions that generalize well to almost any curve. Recurrent networks can decompose the evaluation of long-range dependencies into a sequence of local operations, and learning with small images alleviates the common difficulties of training recurrent networks with a large number of unrolling steps.
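To make "a sequence of local operations" concrete, here is a minimal sketch of a classical way to compute insideness: mark the image border as outside, then repeatedly spread the outside label to 4-connected neighbours without crossing the curve. It is an illustration only and is not taken from the paper. The function name `insideness`, the boolean-array input format, and the assumption that the curve is closed and does not touch the image border are all choices made for this example.

```python
import numpy as np


def insideness(curve: np.ndarray) -> np.ndarray:
    """Return a boolean mask of the pixels enclosed by `curve`.

    `curve` is a 2-D array that is True on curve pixels (hypothetical
    input format). Assumes the curve does not touch the image border.
    """
    curve = curve.astype(bool)

    # Seed: border pixels that are not on the curve are outside.
    outside = np.zeros_like(curve)
    outside[0, :] = outside[-1, :] = True
    outside[:, 0] = outside[:, -1] = True
    outside &= ~curve

    while True:
        # One local step: a pixel becomes outside if any of its
        # 4-connected neighbours is already outside.
        grown = outside.copy()
        grown[1:, :] |= outside[:-1, :]
        grown[:-1, :] |= outside[1:, :]
        grown[:, 1:] |= outside[:, :-1]
        grown[:, :-1] |= outside[:, 1:]
        grown &= ~curve  # the curve blocks the spread
        if np.array_equal(grown, outside):
            break  # fixed point reached
        outside = grown

    # Inside pixels are neither outside nor on the curve.
    return ~outside & ~curve


if __name__ == "__main__":
    # A 7x7 image containing a closed square curve.
    img = np.zeros((7, 7), dtype=bool)
    img[1, 1:6] = img[5, 1:6] = True
    img[1:6, 1] = img[1:6, 5] = True
    print(insideness(img).astype(int))
```

Each pass of the loop looks only at immediate neighbours, yet the number of passes needed grows with the size of the region the label has to cross. That is the same trade-off the text describes for recurrent networks: the operations are local, but larger images need more unrolling steps.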
