Uploader Intent for Online Video: Typology, Inference, and Applications

We investigate automatic inference of uploader intent for online video, i.e., prediction of the reason for which a user has uploaded a particular video to the Internet. Users upload videos for specific reasons, but rarely state these reasons explicitly in the video metadata. Information about uploaders' motivations ultimately has the potential to benefit a wide range of application areas, including video production, video-based advertising, and video search. In this paper, we apply a combination of social-Web mining and crowdsourcing to arrive at a typology that characterizes the uploader intent of a broad range of videos. We then use a set of multimodal features, including visual semantic features, found to be indicative of uploader intent in order to classify videos automatically into uploader intent classes. We evaluate our approach on a dataset containing approximately 3,000 crowd-annotated videos and demonstrate its usefulness in prediction tasks relevant to common application areas.
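The classification step described above combines features from several modalities. As a minimal illustrative sketch (not the authors' implementation), one simple way to combine modalities is late fusion: train one classifier per modality and average their class scores at prediction time. The nearest-centroid classifiers, feature vectors, and intent labels below are hypothetical toy examples.

```python
# Minimal late-fusion sketch for multiclass uploader-intent classification.
# All class names and feature values are hypothetical toy data.
import math
from collections import defaultdict

def train_centroids(samples):
    """samples: list of (feature_vector, label). Returns label -> centroid."""
    sums, counts = defaultdict(lambda: None), defaultdict(int)
    for vec, label in samples:
        if sums[label] is None:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}

def scores(centroids, vec):
    """Negative Euclidean distance to each class centroid (higher = closer)."""
    return {lab: -math.dist(c, vec) for lab, c in centroids.items()}

def predict(visual_model, text_model, visual_vec, text_vec):
    """Late fusion: average the per-modality scores, pick the best class."""
    sv, st = scores(visual_model, visual_vec), scores(text_model, text_vec)
    fused = {lab: (sv[lab] + st[lab]) / 2 for lab in sv}
    return max(fused, key=fused.get)

# Toy training data: one visual and one textual feature set per video.
visual = [([1.0, 0.0], "share_moment"), ([0.0, 1.0], "promote")]
text = [([0.9, 0.1], "share_moment"), ([0.2, 0.8], "promote")]
vm, tm = train_centroids(visual), train_centroids(text)
predict(vm, tm, [0.8, 0.1], [0.7, 0.3])  # → "share_moment"
```

In practice a discriminative linear model trained on the concatenated or fused features would be a more typical choice; the nearest-centroid fusion above only illustrates the idea of merging per-modality evidence into a single intent decision.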
