On application-unbiased benchmarking of web videos from a social network perspective

With the growing focus on community-contributed videos on the web, there is strong demand for a well-designed web video benchmark to support research on social-network-based video content analysis. Existing video datasets fall short in two respects: (1) as data resources, most are tailored to a specific task, either focusing on a single content analysis task at limited scale or focusing on pure social network analysis without downloading the video content; (2) as evaluation platforms, few pay attention to the potential bias introduced by their sampling criteria, and therefore cannot fairly measure task performance. In this paper, we release a large-scale web video benchmark named MCG-WEBV 2.0, which contains 248,887 crawled YouTube videos and their corresponding social network structure with 123,063 video contributors. MCG-WEBV 2.0 can be used to explore the fusion of content and network for several web video analysis tasks. Based on MCG-WEBV 2.0, we further explore the sampling bias that lies in web video benchmark construction. Since sampling a completely unbiased video benchmark from a million-scale collection is impractical, we propose a task-dependent measurement of such bias, which, when the bias is unavoidable, minimizes the correlation between the potential video sampling bias and the corresponding content analysis task. Following this principle, we show several exemplar application scenarios in MCG-WEBV 2.0.
