Biologically Inspired Motion Encoding for Robust Global Motion Estimation

The growing use of cameras embedded in autonomous robotic platforms and worn by people is increasing the importance of accurate global motion estimation (GME). However, existing GME methods can degrade considerably under illumination variations. In this paper, we address this problem by proposing a biologically inspired GME method that achieves high estimation accuracy in the presence of illumination variations. We mimic the early layers of the human visual cortex with spatio-temporal Gabor motion energy, adopting the pioneering model of Adelson and Bergen, and we provide closed-form expressions that enable the study and adaptation of this model to different application needs. Moreover, we propose a normalisation scheme for motion energy to tackle temporal illumination variations. Finally, we provide an overall GME scheme which, to the best of our knowledge, achieves the highest accuracy on the CMU Pose, Illumination, and Expression (PIE) database.
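To make the idea of motion energy concrete, the following is a minimal sketch, not the method of this paper. It assumes one spatial dimension plus time, a complex spatio-temporal Gabor filter whose real and imaginary parts form a quadrature pair, and a simple divisive normalisation across a small bank of velocity-tuned filters. The function names, filter parameters, and the normalisation rule are all illustrative assumptions; the closed-form expressions and the normalisation scheme proposed in the paper are not reproduced here.

```python
import numpy as np

def _grid(half_x, half_t):
    """Centred (time, space) sample grid; hypothetical helper for this sketch."""
    x = np.arange(-half_x, half_x + 1)
    t = np.arange(-half_t, half_t + 1)
    return np.meshgrid(t, x, indexing="ij")  # T and X, each of shape (time, space)

def gabor_st(f_x, f_t, sigma_x, sigma_t, half_x, half_t):
    """Complex spatio-temporal Gabor: Gaussian envelope times complex carrier."""
    T, X = _grid(half_x, half_t)
    envelope = np.exp(-(X ** 2) / (2 * sigma_x ** 2) - (T ** 2) / (2 * sigma_t ** 2))
    carrier = np.exp(2j * np.pi * (f_x * X + f_t * T))
    return envelope * carrier

def motion_energy(patch, kernel):
    """Phase-independent energy: even response squared plus odd response squared."""
    response = np.sum(patch * np.conj(kernel))
    return np.abs(response) ** 2

def normalised_energies(patch, velocities, f_x=0.1, sigma_x=6.0, sigma_t=4.0):
    """Energy per candidate velocity, divided by the total energy of the bank.

    Assumption: divisive normalisation over the bank, chosen only for illustration.
    The patch must have odd size along both axes.
    """
    half_t, half_x = patch.shape[0] // 2, patch.shape[1] // 2
    energies = np.array([
        # A pattern moving at velocity v has temporal frequency -f_x * v.
        motion_energy(patch, gabor_st(f_x, -f_x * v, sigma_x, sigma_t, half_x, half_t))
        for v in velocities
    ])
    return energies / (energies.sum() + 1e-12)

def drifting_grating(v, f_x=0.1, half_x=20, half_t=12, gain=1.0):
    """Synthetic test stimulus: a sinusoid translating at v pixels per frame."""
    T, X = _grid(half_x, half_t)
    return gain * np.cos(2 * np.pi * f_x * (X - v * T))

velocities = [-2.0, -1.0, 0.0, 1.0, 2.0]
bright = normalised_energies(drifting_grating(1.0), velocities)
dimmed = normalised_energies(drifting_grating(1.0, gain=0.3), velocities)
best_velocity = velocities[int(np.argmax(bright))]
```

In this toy setting the filter tuned to the true velocity carries most of the normalised energy, and scaling the whole stimulus by a global gain leaves the normalised energies essentially unchanged. Illumination that varies from frame to frame is a harder case that this sketch does not address.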
