Development and evaluation of wireless 3D video conference system using decision tree and behavior network

Video conferencing is a communication technology that allows multiple users to communicate with each other through both images and sound. As the performance of wireless networks has improved, data can be transmitted in real time to mobile devices over a wireless network. However, the amount of data that can be transmitted is limited, so a method to reduce data traffic is essential. There are two general approaches: extracting only the user's image shape, and using virtual humans in the conference. However, data rates on a wireless network remain high even when only the user's image shape is transferred, and with the latter approach the virtual human may express a user's movement erroneously when information about body language or gestures is insufficient. Hence, to conduct a video conference over a wireless network, a method to compensate for such erroneous actions is required. In this article, a virtual human-based video conference framework is proposed. To reduce data traffic, only the user's pose data are extracted from photographed images using an improved binary decision tree and then transmitted to the other users in a markup language. A virtual human then executes behaviors, selected by an improved behavior network according to the transmitted pose data, to express the user's movement accurately. In an experiment, the proposed method was implemented on a mobile device, and a 3-min video conference between two users was analyzed and its process described. Because photographed images were converted into a text-based markup language, the amount of transmitted data was effectively reduced. Using the improved decision tree, the user's pose was estimated with an average of 5.1 comparisons over 63 photographed images, carried out four times a second. The improved behavior network enables the virtual human to execute diverse behaviors.
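The pipeline the abstract describes can be sketched end to end: a binary decision tree classifies a pose from image-derived features, only the resulting pose label is serialized as a compact markup message, and the receiver maps that label to a virtual-human behavior. The feature names, tree thresholds, message schema, and behavior table below are all illustrative assumptions, and the receiver-side lookup is a dictionary stand-in, not a full activation-spreading behavior network:

```python
import xml.etree.ElementTree as ET

# Hypothetical binary features extracted from a photographed silhouette.
features = {"arm_raised": 1, "head_tilt": 0, "torso_lean": 0}

# Binary decision tree: either a leaf pose label, or a tuple of
# (feature to test, subtree if true, subtree if false).
TREE = ("arm_raised",
        ("head_tilt", "wave", "point"),
        ("torso_lean", "lean_forward", "neutral"))

def classify(node, feats, depth=0):
    """Walk the tree; return (pose label, number of comparisons made)."""
    if isinstance(node, str):          # leaf reached
        return node, depth
    feat, yes, no = node
    return classify(yes if feats[feat] else no, feats, depth + 1)

pose, comparisons = classify(TREE, features)

# Sender side: transmit only the pose label as text-based markup,
# instead of the photographed image itself.
msg = ET.Element("pose")
msg.set("label", pose)
msg.set("comparisons", str(comparisons))
payload = ET.tostring(msg, encoding="unicode")
print(payload)

# Receiver side: a minimal stand-in for the behavior network, choosing
# the virtual human's behavior that matches the received pose label.
BEHAVIORS = {
    "point": "raise_right_arm",
    "wave": "wave_right_hand",
    "lean_forward": "bow",
    "neutral": "idle",
}
received = ET.fromstring(payload)
action = BEHAVIORS[received.get("label")]
print(action)
```

The point of the sketch is the data-rate argument: the transmitted payload is a short text string whose size is independent of image resolution, and the number of comparisons grows only with tree depth rather than with the number of pose classes.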
