Anthropocentric Video Segmentation for Lecture Webcasts

Many lecture recording and presentation systems transmit slides or chalkboard content along with a small video of the instructor. As a result, two areas of the screen compete for the viewer's attention, which causes the well-known split-attention effect. In addition, facial and body gestures, such as pointing, are detached from the slide or board content they refer to. To eliminate this problem, this article proposes to extract the lecturer from the video stream and paste his or her image onto the board or slide image. The lecturer, acting in front of the board or slides, thereby becomes the center of attention, and the entire lecture presentation becomes more human-centered. This article presents both an analysis of the underlying psychological problems and an explanation of the signal processing techniques applied in a concrete system. The presented algorithm extracts and overlays the lecturer online and in real time at full video resolution.
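
The overlay step described above can be illustrated with a minimal sketch. It is not the article's algorithm: it assumes the segmentation has already produced a per-pixel foreground mask, and the function name `overlay_lecturer`, the list-of-rows image representation, and the linear blending rule are all hypothetical choices made for illustration only.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each 0..255
Image = List[List[Pixel]]      # rows of pixels
Mask = List[List[float]]       # assumed input: 1.0 = lecturer, 0.0 = background


def overlay_lecturer(board: Image, camera: Image, mask: Mask) -> Image:
    """Paste lecturer pixels from `camera` onto `board` wherever `mask` is set.

    Fractional mask values blend the two sources linearly, which is one
    simple way to soften the boundary (an assumption, not the paper's method).
    """
    if not (len(board) == len(camera) == len(mask)):
        raise ValueError("board, camera and mask must have the same height")

    result: Image = []
    for board_row, camera_row, mask_row in zip(board, camera, mask):
        if not (len(board_row) == len(camera_row) == len(mask_row)):
            raise ValueError("board, camera and mask must have the same width")
        out_row: List[Pixel] = []
        for b, c, m in zip(board_row, camera_row, mask_row):
            # Weighted sum per channel: m = 1 keeps the camera pixel,
            # m = 0 keeps the board pixel.
            r, g, bl = (round(m * cv + (1.0 - m) * bv) for cv, bv in zip(c, b))
            out_row.append((r, g, bl))
        result.append(out_row)
    return result


# Usage: a 1x2 white board, with the lecturer covering only the left pixel.
board = [[(255, 255, 255), (255, 255, 255)]]
camera = [[(10, 20, 30), (40, 50, 60)]]
mask = [[1.0, 0.0]]
composited = overlay_lecturer(board, camera, mask)
```

In this sketch the hard part, computing the mask for each frame, is deliberately left out; the example only shows how a finished mask places the lecturer in front of the board content.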
