Camera and microphone array for 3D audiovisual face data collection

This paper proposes a novel camera/microphone array system that captures dynamic facial expression video with synchronized speech and reconstructs realistic 3D face models from the captured data. It addresses both hardware and software issues, including camera calibration, video/audio synchronization, facial marker tracking, and 3D shape reconstruction. To the best of our knowledge, this is the first camera/microphone array system able to capture high-resolution facial expression video with synchronized speech. The system can be used to collect dynamic 3D audiovisual face data for many multimedia applications.
