Deep learning from temporal coherence in video

This work proposes a learning method for deep architectures that exploits sequential data, in particular the temporal coherence that naturally exists in unlabeled video recordings: two successive frames are likely to contain the same object or objects. This coherence serves as a supervisory signal over the unlabeled data and improves performance on a supervised task of interest. We demonstrate the effectiveness of this method on pose-invariant object and face recognition tasks.
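The abstract does not state the exact form of the coherence objective. As a minimal sketch, assuming a siamese-style contrastive loss on frame embeddings, the idea can be illustrated as follows: embeddings of consecutive frames are pulled together, and embeddings of non-consecutive frames are pushed at least a margin apart. The names `l1_distance`, `coherence_loss`, and `margin`, and the choice of L1 distance, are assumptions made for illustration and are not taken from the text above.

```python
from typing import Sequence


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute coordinate differences between two embeddings."""
    return sum(abs(x - y) for x, y in zip(a, b))


def coherence_loss(
    z1: Sequence[float],
    z2: Sequence[float],
    consecutive: bool,
    margin: float = 1.0,  # hypothetical hyperparameter, not from the source
) -> float:
    """Illustrative temporal coherence penalty on a pair of frame embeddings.

    Consecutive frames: penalise any distance between the embeddings.
    Non-consecutive frames: penalise only when they are closer than `margin`.
    """
    d = l1_distance(z1, z2)
    if consecutive:
        return d
    return max(0.0, margin - d)


# Two successive frames with similar embeddings incur a small penalty.
near = coherence_loss([0.0, 1.0], [0.5, 1.0], consecutive=True)

# Two unrelated frames already far apart incur no penalty.
far = coherence_loss([0.0, 0.0], [3.0, 0.0], consecutive=False)
```

In a training loop, a term of this kind would be computed on unlabeled video frame pairs and minimised alongside the supervised loss on labeled examples. How the two are combined or alternated is left open here.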
