A crowdsourcing approach to support video annotation

In this paper we present an innovative approach to support efficient large-scale video annotation by exploiting crowdsourcing. In particular, we collect a large number of noisy annotations through an online Flash game in which players take photos of objects appearing throughout the game levels. The data gathered from the game is suitably processed and then used to drive two image segmentation approaches, namely Region Growing and GrabCut, which allow us to derive meaningful annotations. A comparison against hand-labeled ground truth data showed that the proposed approach constitutes a valid alternative to existing video annotation approaches and allows a reliable and fast collection of large-scale ground truth data for performance evaluation in computer vision.
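As a rough illustration of how noisy player clicks could drive a segmentation step, the sketch below reduces a set of click coordinates to a single seed and grows a region from it. It is a minimal sketch under stated assumptions, not the method used in the paper: the median-based aggregation, the function names `seed_from_clicks` and `region_grow`, the tolerance `tol`, and the 4-connected, running-mean growth rule are all hypothetical choices made here for illustration, and the growth rule is a simplified variant of seeded region growing.

```python
from collections import deque

import numpy as np


def seed_from_clicks(clicks):
    """Reduce noisy (row, col) clicks to one seed via the per-axis median.

    Assumption: the median is used only as a simple outlier-robust aggregate.
    """
    pts = np.asarray(clicks, dtype=float)
    med = np.median(pts, axis=0)
    return int(round(float(med[0]))), int(round(float(med[1])))


def region_grow(image, seed, tol=10.0):
    """Grow a 4-connected region from `seed` on a 2-D grayscale image.

    A neighbour joins the region when its intensity is within `tol`
    of the current mean intensity of the region.
    """
    h, w = image.shape
    mask = np.zeros((h, w), dtype=bool)
    r0, c0 = seed
    mask[r0, c0] = True
    total, count = float(image[r0, c0]), 1
    queue = deque([(r0, c0)])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and not mask[nr, nc]:
                if abs(float(image[nr, nc]) - total / count) <= tol:
                    mask[nr, nc] = True
                    total += float(image[nr, nc])
                    count += 1
                    queue.append((nr, nc))
    return mask


if __name__ == "__main__":
    # Synthetic frame: a bright 3x3 object on a dark background.
    frame = np.zeros((8, 8), dtype=np.uint8)
    frame[2:5, 3:6] = 200
    # Hypothetical clicks; (7, 0) is an outlier that misses the object.
    clicks = [(3, 4), (2, 4), (4, 5), (7, 0), (3, 3)]
    seed = seed_from_clicks(clicks)
    mask = region_grow(frame, seed, tol=10.0)
    print(seed, int(mask.sum()))
```

The same seed could instead initialise a rectangle or foreground hint for a GrabCut-style method; the aggregation step would stay the same and only the segmentation call would change.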
