Online Sparse Gaussian Process Regression and Its Applications

We present a new Gaussian process (GP) inference algorithm, called online sparse matrix Gaussian processes (OSMGP), and demonstrate its merits by applying it to head pose estimation and visual tracking. OSMGP is based on the observation that, for kernels with local support, the Gram matrix is typically sparse. The sparse Cholesky factor of the Gram matrix can be maintained and updated efficiently using Givens rotations. This leads to an exact, online algorithm whose update time scales linearly with the size of the Gram matrix. Further, we provide a method for constant-time operation of OSMGP using matrix downdates, which keep the Cholesky factor at a constant size by removing the rows and columns corresponding to discarded training examples. We demonstrate that, using these matrix downdates, online hyperparameter estimation can be included at a cost linear in the total number of training examples. We describe a robust appearance-based head pose estimation system based on OSMGP. Numerous experiments and comparisons with existing methods on a large dataset demonstrate the efficiency and accuracy of our system. Further, to showcase the applicability of OSMGP to a wide variety of problems, we also describe a regression-based visual tracking method. Experiments show that our OSMGP algorithm generalizes well using online learning.
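
The two ingredients named in the abstract, a locally supported kernel that yields a sparse Gram matrix and a Cholesky factor updated by Givens rotations, can be illustrated with a minimal sketch. This is not the OSMGP implementation: it uses dense pure-Python lists, a one-dimensional triangular ("hat") kernel chosen only because it has compact support, and a generic rank-one update R^T R + v v^T. The names hat_kernel, gram, cholesky_upper, givens_rank_one_update and the values of support and noise are all assumptions made for the example.

```python
import math

def hat_kernel(x, y, support=1.0):
    # Compactly supported kernel (assumed for illustration):
    # exactly zero whenever |x - y| >= support.
    return max(0.0, 1.0 - abs(x - y) / support)

def gram(xs, support=1.0, noise=0.1):
    # Gram matrix with a small noise term on the diagonal.
    n = len(xs)
    return [[hat_kernel(xs[i], xs[j], support) + (noise if i == j else 0.0)
             for j in range(n)] for i in range(n)]

def cholesky_upper(A):
    # Dense reference factorization A = R^T R, with R upper triangular.
    n = len(A)
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        s = A[i][i] - sum(R[k][i] ** 2 for k in range(i))
        R[i][i] = math.sqrt(s)
        for j in range(i + 1, n):
            t = A[i][j] - sum(R[k][i] * R[k][j] for k in range(i))
            R[i][j] = t / R[i][i]
    return R

def givens_rank_one_update(R, v):
    # Return upper-triangular R_new with R_new^T R_new = R^T R + v v^T.
    # The row v^T is conceptually appended below R and then zeroed out,
    # one entry at a time, by Givens rotations against the rows of R.
    n = len(R)
    R = [row[:] for row in R]
    v = v[:]
    for i in range(n):
        if v[i] == 0.0:
            continue  # entry already zero: no rotation needed
        r = math.hypot(R[i][i], v[i])
        c, s = R[i][i] / r, v[i] / r
        for j in range(i, n):
            a, b = R[i][j], v[j]
            R[i][j] = c * a + s * b
            v[j] = -s * a + c * b
    return R

# Inputs on a regular grid; only nearby points have nonzero covariance.
xs = [0.1 * i for i in range(30)]
K = gram(xs, support=0.35)
zeros = sum(1 for row in K for e in row if e == 0.0)
sparsity = zeros / (len(xs) ** 2)
print(f"fraction of exact zeros in the Gram matrix: {sparsity:.2f}")

# Update the factor after a sparse rank-one change to K.
R = cholesky_upper(K)
v = [0.0] * len(xs)
v[3], v[4] = 0.5, 0.2
R_new = givens_rank_one_update(R, v)
```

In this sketch each rotation touches only two rows, and rotations are skipped wherever the incoming row is already zero; a sparse implementation would exploit that structure, which the dense lists used here do not.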
