3D head tracking based on recognition and interpolation using a time-of-flight depth sensor

This paper describes a head-tracking algorithm based on recognition and correlation-weighted interpolation. The input is a sequence of 3D depth images generated by a novel time-of-flight depth sensor. These are first segmented into background and foreground, and the foreground is passed to the head-tracking algorithm, which comprises three major modules. First, a depth signature is extracted from each depth image. Next, the signature is compared against signatures collected in a training set of depth images. Finally, correlation metrics are computed for the most likely signature matches. The head location is then calculated by interpolating among the stored head positions, using the correlation metrics as weights. This combination of depth sensing and recognition-based head tracking achieves a success rate above 90 percent. Even if the track is temporarily lost, it is recovered as soon as a good match is found in the training set. The use of depth images and recognition-based head tracking yields robust real-time tracking under extreme conditions such as 180-degree rotation, temporary occlusion, and complex backgrounds.
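The matching-and-interpolation step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the signature representation, the normalized cross-correlation metric, the top-k cutoff, and the function names are all assumptions made for the example.

```python
import math

def correlation(a, b):
    # Zero-mean normalized cross-correlation of two equal-length
    # depth signatures; ranges from -1 to 1 (assumed metric).
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    da = math.sqrt(sum((x - ma) ** 2 for x in a))
    db = math.sqrt(sum((y - mb) ** 2 for y in b))
    return num / (da * db) if da and db else 0.0

def estimate_head_location(query, training, k=3):
    """training: list of (signature, (x, y, z)) pairs from the training set.
    Returns the correlation-weighted average of the top-k matched locations,
    or None when no positive match is found (track temporarily lost)."""
    scored = sorted(((correlation(query, sig), loc) for sig, loc in training),
                    key=lambda pair: pair[0], reverse=True)[:k]
    # Negative correlations carry no weight in the interpolation.
    total = sum(max(c, 0.0) for c, _ in scored)
    if total == 0.0:
        return None  # recover later, when a good match appears
    return tuple(sum(max(c, 0.0) * loc[i] for c, loc in scored) / total
                 for i in range(3))
```

In this sketch, a lost track simply yields `None` until a query signature correlates well with some stored signature again, which mirrors the recovery behavior the abstract describes.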
