Synthetic and hybrid imaging in the HUMANOID and VIDAS projects

The research activity in natural/synthetic image processing and representation reported in this paper, initiated under the Esprit project HUMANOID and currently being continued under the ACTS project VIDAS, concerns the application of virtual-reality methodologies to interpersonal audio/video communication. The 3D videophone scene is modeled in video (the talker's face) and in audio (the talker's speech) so that natural data can be efficiently mixed with synthetic data and mapped onto deformable parameterized structures. Robust image analysis/synthesis tools are necessary to extract the visual primitives associated with the talker's face and to adapt them to suitable modeling structures (wire-frames). Image and speech analysis performed at the transmitter provides the audio/video parameters, which are encoded and then used at the receiver to synthesize the corresponding facial expressions together with synchronized lip movements.
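The transmitter/receiver parameter pipeline described above can be sketched in miniature as follows. The parameter names, value ranges, and 8-bit quantization below are illustrative assumptions, not the actual HUMANOID/VIDAS parameter set: the point is only that model-based coding transmits a few compact deformation parameters per frame rather than pixel data, and the receiver applies them to its local wire-frame model.

```python
from dataclasses import dataclass
import struct

# Hypothetical facial animation parameters; the projects' real wire-frame
# deformation parameter set is far richer than these three values.
@dataclass
class FaceParams:
    jaw_open: float      # 0.0 (closed) .. 1.0 (fully open)
    lip_stretch: float   # -1.0 (pursed) .. 1.0 (stretched)
    brow_raise: float    # -1.0 (lowered) .. 1.0 (raised)

def encode(p: FaceParams) -> bytes:
    """Transmitter side: quantize each parameter to 8 bits and pack."""
    def q(x: float, lo: float, hi: float) -> int:
        return max(0, min(255, round((x - lo) / (hi - lo) * 255)))
    return struct.pack("BBB",
                       q(p.jaw_open, 0.0, 1.0),
                       q(p.lip_stretch, -1.0, 1.0),
                       q(p.brow_raise, -1.0, 1.0))

def decode(b: bytes) -> FaceParams:
    """Receiver side: unpack and dequantize to drive the local wire-frame."""
    def dq(v: int, lo: float, hi: float) -> float:
        return lo + v / 255 * (hi - lo)
    j, l, r = struct.unpack("BBB", b)
    return FaceParams(dq(j, 0.0, 1.0), dq(l, -1.0, 1.0), dq(r, -1.0, 1.0))

# One frame of parameters costs 3 bytes instead of a full video frame.
sent = FaceParams(jaw_open=0.4, lip_stretch=0.2, brow_raise=-0.1)
received = decode(encode(sent))
```

In an actual system the decoded parameters would deform the vertices of the receiver's face model each frame, with the speech channel driving the lip parameters for synchronization.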