How Do Neural Networks Estimate Optical Flow? A Neuropsychology-Inspired Study

End-to-end trained convolutional neural networks have led to a breakthrough in optical flow estimation. The most recent advances focus on improving the optical flow estimation by improving the architecture and setting a new benchmark on the publicly available MPI-Sintel dataset. Instead, in this article, we investigate how deep neural networks estimate optical flow. A better understanding of how these networks function is important for (i) assessing their generalization capabilities to unseen inputs, and (ii) suggesting changes to improve their performance. For our investigation, we focus on FlowNetS, as it is the prototype of an encoder-decoder neural network for optical flow estimation. Furthermore, we use a filter identification method that has played a major role in uncovering the motion filters present in animal brains in neuropsychological research. The method shows that the filters in the deepest layer of FlowNetS are sensitive to a variety of motion patterns. Not only do we find translation filters, as demonstrated in animal brains, but thanks to the easier measurements in artificial neural networks, we even unveil dilation, rotation, and occlusion filters. Furthermore, we find similarities in the refinement part of the network and the perceptual filling-in process which occurs in the mammal primary visual cortex.

[1]  L. Pessoa,et al.  Filling-in: From perceptual completion to cortical reorganization. , 2003 .

[2]  Lior Wolf,et al.  InterpoNet, a Brain Inspired Neural Network for Optical Flow Dense Interpolation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Aamir Saeed Malik,et al.  An evaluation of optical flow algorithms for crowd analytics in surveillance system , 2016, 2016 6th International Conference on Intelligent and Advanced Systems (ICIAS).

[4]  Joachim Weickert,et al.  Universität Des Saarlandes Fachrichtung 6.1 – Mathematik Optic Flow in Harmony Optic Flow in Harmony Optic Flow in Harmony , 2022 .

[5]  Martial Hebert,et al.  Learning to Extract Motion from Videos in Convolutional Neural Networks , 2016, ACCV.

[6]  Jianfeng Feng,et al.  Computational neuroscience , 1986, Behavioral and Brain Sciences.

[7]  L. Palmer,et al.  Receptive-field structure in cat striate cortex. , 1981, Journal of neurophysiology.

[8]  Michael J. Black,et al.  Optical Flow Estimation Using a Spatial Pyramid Network , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  R. Hetherington The Perception of the Visual World , 1952 .

[10]  Alexander Mordvintsev,et al.  Inceptionism: Going Deeper into Neural Networks , 2015 .

[11]  Michael J. Black,et al.  Attacking Optical Flow , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[12]  I. Ohzawa,et al.  Spatiotemporal organization of simple-cell receptive fields in the cat's striate cortex. II. Linearity of temporal and spatial summation. , 1993, Journal of neurophysiology.

[13]  Thomas Brox,et al.  Striving for Simplicity: The All Convolutional Net , 2014, ICLR.

[14]  Michael J. Black,et al.  A Naturalistic Open Source Movie for Optical Flow Evaluation , 2012, ECCV.

[15]  Ronald N. Bracewell,et al.  The Fourier Transform and Its Applications , 1966 .

[16]  Thomas Brox,et al.  High Accuracy Optical Flow Estimation Based on a Theory for Warping , 2004, ECCV.

[17]  Jan Kautz,et al.  Models Matter, So Does Training: An Empirical Study of CNNs for Optical Flow Estimation , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[18]  Berthold K. P. Horn,et al.  Determining Optical Flow , 1981, Other Conferences.

[19]  H. Komatsu The neural mechanisms of perceptual filling-in , 2006, Nature Reviews Neuroscience.

[20]  A. Borst,et al.  Fly motion vision. , 2010, Annual review of neuroscience.

[21]  Steven S. Beauchemin,et al.  The Frequency Structure of One-Dimensional Occluding Image Signals , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[22]  Deborah Silver,et al.  Feature Visualization , 1994, Scientific Visualization.

[23]  Daniel Cremers,et al.  What Makes Good Synthetic Training Data for Learning Disparity and Optical Flow Estimation? , 2018, International Journal of Computer Vision.

[24]  D. Regan,et al.  Looming detectors in the human visual pathway , 1978, Vision Research.

[25]  Thomas Brox,et al.  FlowNet: Learning Optical Flow with Convolutional Networks , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[26]  I. Ohzawa,et al.  Receptive-field dynamics in the central visual pathways , 1995, Trends in Neurosciences.

[27]  J. P. Jones,et al.  The two-dimensional spectral structure of simple receptive fields in cat striate cortex. , 1987, Journal of neurophysiology.

[28]  Xiaoou Tang,et al.  A Lightweight Optical Flow CNN —Revisiting Data Fidelity and Regularization , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[29]  J. P. Jones,et al.  An evaluation of the two-dimensional Gabor filter model of simple receptive fields in cat striate cortex. , 1987, Journal of neurophysiology.

[30]  Cordelia Schmid,et al.  DeepFlow: Large Displacement Optical Flow with Deep Matching , 2013, 2013 IEEE International Conference on Computer Vision.

[31]  Bruno A. Olshausen,et al.  Learning sparse, overcomplete representations of time-varying natural images , 2003, Proceedings 2003 International Conference on Image Processing (Cat. No.03CH37429).

[32]  MicroAir,et al.  Monocular distance estimation with optical flow maneuvers and efference copies: a stability-based strategy , 2016, Bioinspiration & Biomimetics.

[33]  Thomas Brox,et al.  A Large Dataset to Train Convolutional Networks for Disparity, Optical Flow, and Scene Flow Estimation , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[34]  Takeo Kanade,et al.  An Iterative Image Registration Technique with an Application to Stereo Vision , 1981, IJCAI.

[35]  A. Borst,et al.  Common circuit design in fly and mammalian motion vision , 2015, Nature Neuroscience.

[36]  Bolei Zhou,et al.  Understanding Intra-Class Knowledge Inside CNN , 2015, ArXiv.

[37]  D. Ruderman,et al.  Independent component analysis of natural image sequences yields spatio-temporal filters similar to simple cells in primary visual cortex , 1998, Proceedings of the Royal Society of London. Series B: Biological Sciences.

[38]  H. C. Longuet-Higgins,et al.  The interpretation of a moving retinal image , 1980, Proceedings of the Royal Society of London. Series B. Biological Sciences.

[39]  I. Ohzawa,et al.  Spatiotemporal organization of simple-cell receptive fields in the cat's striate cortex. I. General characteristics and postnatal development. , 1993, Journal of neurophysiology.

[40]  Richard Szeliski,et al.  A Database and Evaluation Methodology for Optical Flow , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[41]  Ajit Singh,et al.  Optic flow computation : a unified perspective , 1991 .

[42]  Thomas Brox,et al.  Synthesizing the preferred inputs for neurons in neural networks via deep generator networks , 2016, NIPS.

[43]  S. Ullman,et al.  The interpretation of visual motion , 1977 .

[44]  Jitendra Malik,et al.  Large Displacement Optical Flow: Descriptor Matching in Variational Motion Estimation , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[45]  Thomas Brox,et al.  Uncertainty Estimates and Multi-hypotheses Networks for Optical Flow , 2018, ECCV.

[46]  P. Anandan,et al.  A computational framework and an algorithm for the measurement of visual motion , 1987, International Journal of Computer Vision.

[47]  J. P. Jones,et al.  The two-dimensional spatial structure of simple receptive fields in cat striate cortex. , 1987, Journal of neurophysiology.

[48]  H. Neumann,et al.  The Role of Attention in Figure-Ground Segregation in Areas V1 and V4 of the Visual Cortex , 2012, Neuron.

[49]  D. G. Albrecht,et al.  Visual cortical neurons: are bars or gratings the optimal stimuli? , 1980, Science.

[50]  Pascal Vincent,et al.  Visualizing Higher-Layer Features of a Deep Network , 2009 .

[51]  Xiaolin Hu,et al.  Recurrent convolutional neural network for object recognition , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[52]  David J. Heeger,et al.  Optical flow using spatiotemporal filters , 2004, International Journal of Computer Vision.

[53]  Thomas Brox,et al.  U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.

[54]  David J. Fleet,et al.  Computation of normal velocity from local phase information , 1989, Proceedings CVPR '89: IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[55]  R. L. Valois,et al.  The orientation and direction selectivity of cells in macaque visual cortex , 1982, Vision Research.

[56]  Baoxin Li,et al.  A survey of variational and CNN-based optical flow techniques , 2019, Signal Process. Image Commun..

[57]  RussLL L. Ds Vnlos,et al.  SPATIAL FREQUENCY SELECTIVITY OF CELLS IN MACAQUE VISUAL CORTEX , 2022 .

[58]  Thomas Brox,et al.  FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[59]  Prashant Parikh A Theory of Communication , 2010 .

[60]  Cordelia Schmid,et al.  EpicFlow: Edge-preserving interpolation of correspondences for optical flow , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[61]  Rob Fergus,et al.  Visualizing and Understanding Convolutional Networks , 2013, ECCV.

[62]  N. Petkov,et al.  Motion detection, noise reduction, texture suppression, and contour enhancement by spatiotemporal Gabor filters with surround inhibition , 2007, Biological Cybernetics.