Gaze manipulation for one-to-one teleconferencing

A new algorithm is proposed for novel-view generation in one-to-one teleconferencing applications. Given the video streams acquired by two cameras placed on either side of a computer monitor, the proposed algorithm synthesizes images from a virtual camera at an arbitrary position (typically located within the monitor) to facilitate eye contact. Our technique is based on an improved dynamic-programming stereo algorithm for efficient novel-view generation. The two main contributions are: i) a new type of three-plane graph for dense-stereo dynamic programming that encourages correct occlusion labeling; and ii) a compact geometric derivation for novel-view synthesis by direct projection of the minimum-cost surface. Furthermore, we present a novel algorithm for the temporal maintenance of a background model, which enhances the rendering of occlusions and reduces temporal artefacts (flicker), and a cost aggregation algorithm that acts directly on our three-dimensional matching cost space. Examples demonstrate the robustness of the new algorithm to spatial and temporal artefacts for long stereo video streams, including the synthesis of cyclopean views of extended conversational sequences. We further demonstrate synthesis from a freely translating virtual camera.
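
To make the underlying idea concrete, the following is a minimal sketch of classic scanline dynamic-programming stereo with an explicit occlusion move, followed by a naive cyclopean (midpoint) rendering of the matched pixels. It is not the three-plane graph proposed here, and it omits cost aggregation and the background model; it only illustrates the single-plane formulation that such methods build on. The function names, the absolute-difference matching cost, and the `occlusion_cost` value are assumptions chosen for illustration.

```python
import numpy as np


def scanline_dp(left, right, occlusion_cost=20.0):
    """Match two rectified scanlines with a three-move dynamic program.

    Moves: match left[i] with right[j], or skip one pixel in either
    scanline (treated as occluded) at a fixed penalty (assumed value).
    Returns the ordered list of matched index pairs and the total cost.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    n, m = len(left), len(right)

    cost = np.zeros((n + 1, m + 1))
    cost[:, 0] = occlusion_cost * np.arange(n + 1)
    cost[0, :] = occlusion_cost * np.arange(m + 1)

    # 0 = match, 1 = left pixel occluded, 2 = right pixel occluded
    move = np.zeros((n + 1, m + 1), dtype=np.uint8)
    move[1:, 0] = 1
    move[0, 1:] = 2

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            candidates = (
                cost[i - 1, j - 1] + abs(left[i - 1] - right[j - 1]),
                cost[i - 1, j] + occlusion_cost,
                cost[i, j - 1] + occlusion_cost,
            )
            best = int(np.argmin(candidates))
            move[i, j] = best
            cost[i, j] = candidates[best]

    # Backtrack from the end of both scanlines to recover the path.
    matches = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and move[i, j] == 0:
            matches.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif move[i, j] == 1:
            i -= 1
        else:
            j -= 1
    matches.reverse()
    return matches, float(cost[n, m])


def cyclopean_scanline(left, right, matches):
    """Render matched pixels at the midpoint of their two positions.

    Pixels with no match (occlusions) are left as NaN; a real system
    would fill them, for example from a background model.
    """
    out = np.full(max(len(left), len(right)), np.nan)
    for i, j in matches:
        out[(i + j) // 2] = 0.5 * (left[i] + right[j])
    return out


if __name__ == "__main__":
    left = [10, 10, 50, 90, 90, 10, 10]
    right = [10, 50, 90, 90, 10, 10, 10]  # same profile shifted by one pixel
    matches, total = scanline_dp(left, right)
    print(matches, total)
    print(cyclopean_scanline(left, right, matches))
```

In this sketch the occlusion penalty trades off against the matching cost: a lower value lets the path skip pixels freely, while a higher value forces matches even across intensity differences.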
