Technologies for building networked collaborative environments

In this paper, we begin by reporting our work on lip synchronization. We then build on this technology to introduce our recent effort to develop a networked collaborative environment that integrates image analysis, face animation, and directional sound to provide an immersive experience, with the goal of replacing existing video conferencing platforms. We also introduce a related technology, the streaming of 3D objects.
