Human action analysis, annotation and modeling in video streams based on implicit user interaction

This paper proposes an integrated framework for analyzing human actions in video streams. Unlike most current approaches, which rely solely on automatic spatiotemporal analysis of sequences, the proposed method introduces the implicit user-in-the-loop concept for dynamically mining semantics and annotating video streams. This work sets a new and ambitious goal: to recognize, model and properly use the "average user's" selections, preferences and perception in order to dynamically extract content semantics. The proposed approach is expected to add significant value to the hundreds of billions of non-annotated or inadequately annotated video streams on the Web, file servers, databases, etc. Furthermore, expert annotators can gain important knowledge about user preferences, selections, search styles and perception.
