Depth Recovery with Face Priors

Existing depth recovery methods for commodity RGB-D sensors primarily rely on low-level information to repair the measured depth estimates. However, as the distance of the scene from the camera increases, the recovered depth estimates become increasingly unreliable. The human face is often a primary subject in captured RGB-D data in applications such as video conferencing. In this paper, we propose to incorporate face priors extracted from a general sparse 3D face model into the depth recovery process. In particular, we propose a joint optimization framework that consists of two main steps: deforming the face model for better alignment and applying face priors for improved depth recovery. The two steps are alternated iteratively so that each improves the other. Evaluations on benchmark datasets demonstrate that the proposed method with face priors significantly outperforms the baseline method that does not use face priors, with up to 15.1% improvement in depth recovery quality and up to 22.3% in registration accuracy.
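The alternating structure described above can be illustrated with a toy sketch. This is not the paper's formulation: it assumes known one-to-one correspondences between model vertices and measured 3D points, replaces the non-rigid model deformation with a purely rigid least-squares alignment (SVD-based), and replaces the depth recovery step with a simple weighted blend toward the aligned model. The function names `rigid_align` and `alternate` and the `weight` parameter are hypothetical.

```python
import numpy as np

def rigid_align(src, dst):
    """Least-squares rigid transform (R, t) that maps src onto dst.

    Assumes src[i] corresponds to dst[i]; both are (N, 3) arrays.
    """
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)      # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection in the SVD solution.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def alternate(model, measured, weight=0.5, iters=5):
    """Toy alternation between alignment and prior-guided recovery.

    Step 1: align the face model to the current depth estimate.
    Step 2: pull the measured points toward the aligned model (the "prior").
    The blend weight is an illustrative assumption, not a tuned value.
    """
    recovered = measured.copy()
    aligned = model.copy()
    for _ in range(iters):
        R, t = rigid_align(model, recovered)
        aligned = model @ R.T + t
        recovered = (1.0 - weight) * measured + weight * aligned
    return recovered, aligned

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    model = rng.normal(size=(50, 3))                  # stand-in sparse face model
    a = 0.3
    R_true = np.array([[np.cos(a), -np.sin(a), 0.0],
                       [np.sin(a),  np.cos(a), 0.0],
                       [0.0,        0.0,       1.0]])
    truth = model @ R_true.T + np.array([0.1, -0.2, 1.5])
    measured = truth + rng.normal(scale=0.01, size=truth.shape)  # noisy depth points
    recovered, _ = alternate(model, measured)
    print("error before:", np.linalg.norm(measured - truth))
    print("error after: ", np.linalg.norm(recovered - truth))
```

In this simplified setting the prior helps because the aligned model averages the sensor noise over all points, whereas each measured point carries its own independent error. A real system would also need to estimate correspondences and model non-rigid shape and expression changes.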
