A best view selection in meetings through attention analysis using a multi-camera network

Human activity analysis is an essential task in ambient intelligence and computer vision. Its main focus is the automatic analysis of ongoing activities observed by a multi-camera network. One possible application is meeting analysis, which explores the dynamics of meetings by inferring high-level activities from low-level data. However, detecting such activities remains very challenging because the low-level data are often corrupted or imprecise. In this paper, we present an approach to understanding the dynamics of meetings using a multi-camera network consisting of fixed ambient cameras and portable close-up cameras. As a particular application, we aim to find the most informative video stream, for example as a representative view for a remote participant. Our contribution is threefold. First, we estimate the extrinsic parameters of the portable close-up cameras based on head positions. Second, we find common overlapping areas based on the consensus of people's orientations. Third, we estimate the most informative view for a remote participant using these common overlapping areas. We evaluated the proposed approach and compared it to a motion estimation method. Experimental results show that it reaches an accuracy of 74% compared to manually selected views.
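To make the first contribution more concrete, the sketch below shows one way a camera's pose could be recovered from head positions. It is an illustration under stated assumptions, not the method of the paper: it assumes the same heads have already been located both in a portable camera's local frame and in the room frame, it restricts the problem to the 2D floor plane, and it uses a standard closed-form least-squares rigid alignment. The function name `fit_rigid_2d` and the sample coordinates are hypothetical.

```python
import math


def fit_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping src onto dst.

    src, dst: equal-length lists of (x, y) points in correspondence,
    e.g. head positions in a camera's local frame (src) and in the
    room frame (dst). This 2D setup is an assumption for illustration.
    """
    n = len(src)
    if n != len(dst) or n < 2:
        raise ValueError("need at least two corresponding point pairs")

    # Centroids of both point sets.
    cx_s = sum(p[0] for p in src) / n
    cy_s = sum(p[1] for p in src) / n
    cx_d = sum(p[0] for p in dst) / n
    cy_d = sum(p[1] for p in dst) / n

    # Accumulate dot and cross products of the centered points.
    s_dot = 0.0
    s_cross = 0.0
    for (xs, ys), (xd, yd) in zip(src, dst):
        ax, ay = xs - cx_s, ys - cy_s
        bx, by = xd - cx_d, yd - cy_d
        s_dot += ax * bx + ay * by
        s_cross += ax * by - ay * bx

    # The angle that minimizes the summed squared residual.
    theta = math.atan2(s_cross, s_dot)
    c, s = math.cos(theta), math.sin(theta)

    # Translation that maps the rotated source centroid onto the target centroid.
    tx = cx_d - (c * cx_s - s * cy_s)
    ty = cy_d - (s * cx_s + c * cy_s)
    return theta, (tx, ty)


# Hypothetical head positions (metres) in the camera frame and the room frame.
heads_cam = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]
angle = 0.5
heads_room = [
    (math.cos(angle) * x - math.sin(angle) * y + 2.0,
     math.sin(angle) * x + math.cos(angle) * y - 1.0)
    for x, y in heads_cam
]
theta, (tx, ty) = fit_rigid_2d(heads_cam, heads_room)
```

With noisy or partly wrong correspondences, a fit like this would typically be wrapped in a robust estimator so that outlying head detections do not dominate the result; a full 3D extrinsic estimate would additionally need the rotation about the remaining axes.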
