How Do Neural Networks Estimate Optical Flow? A Neuropsychology-Inspired Study

End-to-end trained convolutional neural networks have led to a breakthrough in optical flow estimation. The most recent advances focus on improving optical flow estimation through architectural changes and on setting new benchmarks on the publicly available MPI-Sintel dataset. In this article, we instead investigate how deep neural networks estimate optical flow. A better understanding of how these networks function is important for (i) assessing their generalization capabilities to unseen inputs, and (ii) suggesting changes to improve their performance. For our investigation, we focus on FlowNetS, as it is the prototype of an encoder-decoder neural network for optical flow estimation. We use a filter identification method that has played a major role in uncovering the motion filters present in animal brains in neuropsychological research. The method shows that the filters in the deepest layer of FlowNetS are sensitive to a variety of motion patterns. Not only do we find translation filters, as demonstrated in animal brains, but, thanks to the easier measurements in artificial neural networks, we also unveil dilation, rotation, and occlusion filters. Furthermore, we find similarities between the refinement part of the network and the perceptual filling-in process that occurs in the mammalian primary visual cortex.
