Multimedia retrieval through spatio-temporal activity maps

As multiple video cameras and other sensors generate very large quantities of multimedia data in media productions and surveillance applications, a key challenge is to identify the relevant portions of the data and to rapidly retrieve the corresponding sensor data. Spatio-temporal activity maps serve as an efficient and intuitive graphical user interface for multimedia retrieval, particularly when the media streams are derived from multiple sensors observing a physical environment. We formulate the media retrieval problem in this context and develop an architecture for interactive media retrieval that combines spatio-temporal activity maps with domain-specific event information. Activity maps are computed from the motion trajectories of objects in the environment, which in turn are derived automatically by analysis of the sensor data. We present an activity-map-based video retrieval system for the sport of tennis and demonstrate that the scheme significantly helps the user in a) discovering the relevant portions of the data, and b) retrieving the corresponding media streams non-linearly.
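
As an illustration of how an activity map might be derived from a trajectory and then used as a retrieval index, the following is a minimal sketch and not the system described here. It assumes a trajectory is a time-ordered list of ground-plane samples, and the names `Sample`, `activity_map`, and `intervals_in_region` are hypothetical. The first function accumulates an occupancy count per grid cell, optionally restricted to a time window; the second maps a user-selected rectangular region back to the time intervals during which the object was inside it, which is the information needed to seek into the media streams non-linearly.

```python
from collections import namedtuple

# Hypothetical sample type: time in seconds, position on the ground plane.
Sample = namedtuple("Sample", ["t", "x", "y"])


def activity_map(trajectory, width, height, cell, t_start=None, t_end=None):
    """Count trajectory samples per grid cell, optionally within a time window."""
    cols = int(width // cell)
    rows = int(height // cell)
    grid = [[0] * cols for _ in range(rows)]
    for s in trajectory:
        if t_start is not None and s.t < t_start:
            continue
        if t_end is not None and s.t > t_end:
            continue
        c = int(s.x // cell)
        r = int(s.y // cell)
        # Samples outside the mapped area are ignored.
        if 0 <= r < rows and 0 <= c < cols:
            grid[r][c] += 1
    return grid


def intervals_in_region(trajectory, x0, y0, x1, y1, max_gap=1.0):
    """Return (start, end) time intervals spent inside a rectangle.

    Assumes the trajectory is sorted by time. A new interval starts when
    the object leaves the region or when consecutive inside samples are
    more than max_gap seconds apart.
    """
    intervals = []
    start = last = None
    for s in trajectory:
        inside = x0 <= s.x <= x1 and y0 <= s.y <= y1
        if inside:
            if start is None:
                start = s.t
            elif s.t - last > max_gap:
                intervals.append((start, last))
                start = s.t
            last = s.t
        elif start is not None:
            intervals.append((start, last))
            start = last = None
    if start is not None:
        intervals.append((start, last))
    return intervals
```

In a retrieval interface of this kind, the grid would be rendered as the map the user sees, and each returned interval would be translated into seek positions in the recorded streams. How domain-specific events (for example, points or games in tennis) further restrict the time window is not modeled in this sketch.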
