Learning to Track Multiple People in Omnidirectional Video

Meetings are an important part of everyday life for professionals working in universities, companies, and governmental institutions. We have designed a physical awareness system called CAMEO (Camera Assisted Meeting Event Observer), a hardware/software system that records and monitors people's activities in meetings. CAMEO captures a high-resolution omnidirectional view of the meeting by stitching images from almost concentric cameras. Besides recording, CAMEO automatically detects people and learns a person-specific facial appearance model (PS-FAM) for each participant. The PS-FAMs make tracking and identification more robust and reliable. In this paper, we describe the video-capturing device, the photometric/geometric autocalibration process, and the multiple-people tracking system. The effectiveness and robustness of the proposed system are demonstrated in several real-time experiments and on a large data set of videos.
