Robust Performance-driven 3D Face Tracking in Long Range Depth Scenes

We introduce a novel robust hybrid 3D face tracking framework for RGBD video streams, capable of tracking head pose and facial actions without pre-calibration or intervention from the user. In particular, we emphasize improving tracking performance when the tracked subject is far from the camera and the quality of the point cloud deteriorates severely. This is accomplished by combining a flexible 3D shape regressor with a joint 2D+3D optimization over the shape parameters. Our approach fits facial blendshapes to the point cloud of the human head while being driven by an efficient and rapid 3D shape regressor trained on generic RGB datasets. As an online tracking system, our framework adapts the identity of the unknown user on the fly, resulting in improved 3D model reconstruction and consequently better tracking performance. The result is a robust RGBD face tracker capable of handling a wider range of target scene depths than traditional depth or RGB face trackers can afford. Lastly, since the blendshape model cannot accurately recover the true facial shape, we use the tracked 3D face model as a prior in a novel filtering process to further refine the depth map for use in other tasks, such as 3D reconstruction.
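
A minimal sketch of the joint 2D+3D fitting implied above, assuming a standard linear blendshape model B(e) = b_0 + \sum_i e_i (b_i - b_0) with expression weights e, a point-to-plane term against the head point cloud, and a landmark reprojection term driven by the 3D shape regressor (the specific terms, symbols, and weight \lambda are illustrative assumptions, not the authors' exact formulation):

E(R, t, e) = \sum_k \big( n_k^\top ( R\,B_k(e) + t - p_k ) \big)^2 + \lambda \sum_j \big\| \Pi( R\,B_j(e) + t ) - u_j \big\|^2,

where (R, t) is the rigid head pose, p_k are the closest point-cloud points with normals n_k, \Pi is the RGB camera projection, and u_j are 2D facial landmarks predicted by the regressor. Minimizing E jointly over pose and expression weights would couple the depth evidence (first term) with the RGB-driven landmarks (second term), which is how a hybrid 2D+3D scheme can remain stable when the point cloud degrades at long range.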
