Paper Information - Visual Learning of Socio-Video Semantics

Visual Learning of Socio-Video Semantics

The ubiquity of visual content today, driven by broadband Internet, low-priced storage, and the omnipresence of camera-equipped mobile devices, conveys much of our thinking and feeling as individuals and as a society. As a result, video repositories are growing at enormous rates, with content now being embedded and shared through social media. To make use of this new form of social multimedia, concept detection, the automatic mapping between semantic concepts and video content, has to be extended in three ways: concept vocabularies must be synchronized with current real-world events, systems must perform scalable concept learning with thousands of concepts, and high-level information such as sentiment must be extracted from visual content. To meet these demands, this thesis makes three contributions: (i) concept detection is linked to trending topics, (ii) visual learning from web videos is presented, including the proper treatment of tags as concept labels, and (iii) the extension of concept detection with adjective noun pairs for sentiment analysis is proposed.

In order for concept detection to satisfy users' current information needs, the notion of fixed concept vocabularies has to be reconsidered. This thesis presents a novel concept learning approach built upon dynamic vocabularies, which are automatically augmented with trending topics mined from social media. Once discovered, trending topics are evaluated by forecasting their future progression to predict high-impact topics, which are then either mapped to an available static concept vocabulary or trained as individual concept detectors on demand. Experiments on YouTube video clips (n=78,000) demonstrate that visual learning of trending topics can improve concept detection accuracy by over 100% compared to static vocabularies.
To remove the manual effort of retrieving training data from YouTube and the noise caused by tags being coarse, subjective, and context-dependent, this thesis suggests an automatic concept-to-query mapping for the retrieval of relevant training video material, and active relevance filtering to generate reliable annotations from web video tags. Here, the relevance of web tags is modeled as a latent variable, which is combined with an active learning label refinement. In experiments on YouTube, active relevance filtering is found to outperform both automatic filtering and active learning approaches, leading to a reduction of required label inspections by 75% as compared to an expert-annotated training dataset (n=100,000).

Finally, it is demonstrated that concept detection can serve as a key component to infer the sentiment reflected in visual content. To extend concept detection for sentiment analysis, adjective noun pairs (ANPs) are proposed in this thesis as novel entities for concept learning. First, a large-scale visual sentiment ontology consisting of 3,000 ANPs is automatically constructed by mining the web. From this ontology, a mid-level representation of visual content, SentiBank, is trained to encode the visual presence of 1,200 ANPs. This novel approach to visual learning is validated in three independent experiments on sentiment prediction (n=2,000), emotion detection (n=807), and pornography filtering (n=40,000). SentiBank is shown to outperform known low-level feature representations (sentiment prediction, pornography detection) or to perform comparably to state-of-the-art methods (emotion detection).

Altogether, these contributions extend state-of-the-art concept detection approaches such that concept learning can be done autonomously from web videos on a large scale, and can cope with novel semantic structures such as trending topics or adjective noun pairs, adding a new dimension to the understanding of video content.
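As a rough illustration of the label-refinement idea summarized in the abstract, the sketch below shows one common active learning heuristic, uncertainty sampling, applied to per-video tag relevance estimates. It is not the thesis's actual method: the function names `select_for_inspection` and `refine`, the relevance scores, and the choice of "closest to 0.5" as the uncertainty measure are all assumptions made for illustration.

```python
def select_for_inspection(relevance, budget):
    """Return indices of the `budget` videos whose estimated tag
    relevance is closest to 0.5, i.e. the most uncertain ones.

    `relevance` holds hypothetical estimates in [0, 1] of how likely
    each tagged video really shows the concept.
    """
    ranked = sorted(range(len(relevance)), key=lambda i: abs(relevance[i] - 0.5))
    return ranked[:budget]


def refine(relevance, inspected):
    """Overwrite estimates with human judgments.

    `inspected` maps a video index to True (relevant) or False (not relevant).
    Videos that were not inspected keep their estimated relevance.
    """
    refined = list(relevance)
    for idx, is_relevant in inspected.items():
        refined[idx] = 1.0 if is_relevant else 0.0
    return refined


# Hypothetical relevance estimates for five videos tagged with one concept.
scores = [0.95, 0.52, 0.10, 0.45, 0.80]
to_check = select_for_inspection(scores, budget=2)   # -> [1, 3]
labels = refine(scores, {1: True, 3: False})         # -> [0.95, 1.0, 0.10, 0.0, 0.80]
```

Spending the inspection budget only on uncertain videos is what lets a scheme of this kind reduce the number of labels a human has to check; confident estimates are used as they are.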

Damian Borth

[1] Marc Cheong,et al. Integrating web-based intelligence retrieval and decision-making from the twitter trends knowledge base , 2009, CIKM-SWSM.

[2] Martha Larson,et al. Intent and its discontents: the user at the wheel of the online video search engine , 2012, ACM Multimedia.

[3] Andrew W. Fitzgibbon,et al. Efficient Object Category Recognition Using Classemes , 2010, ECCV.

[4] Yiannis Kompatsiaris,et al. SocialSensor: sensing user generated input for improved media discovery and experience , 2012, WWW.

[5] Cordelia Schmid,et al. Scale & Affine Invariant Interest Point Detectors , 2004, International Journal of Computer Vision.

[6] Daniel P. W. Ellis,et al. IBM Research and Columbia University TRECVID-2011 Multimedia Event Detection (MED) System , 2011, TRECVID.

[7] Antonio Torralba,et al. Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope , 2001, International Journal of Computer Vision.

[8] Christiane Fellbaum,et al. Book Reviews: WordNet: An Electronic Lexical Database , 1999, CL.

[9] Stefan M. Rüger,et al. Automated Image Annotation Using Global Features and Robust Nonparametric Density Estimation , 2005, CIVR.

[10] A. Atiya,et al. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond , 2005, IEEE Transactions on Neural Networks.

[11] Yueting Zhuang,et al. Adaptive key frame extraction using unsupervised clustering , 1998, Proceedings 1998 International Conference on Image Processing. ICIP98 (Cat. No.98CB36269).

[12] Ramakant Nevatia,et al. Improving Part based Object Detection by Unsupervised, Online Boosting , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[13] Bernt Schiele,et al. Multiple Object Class Detection with a Generative Model , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[14] John Langford,et al. CAPTCHA: Using Hard AI Problems for Security , 2003, EUROCRYPT.

[15] Philip H. S. Torr,et al. Randomized trees for human pose detection , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[16] John F. Canny,et al. A Computational Approach to Edge Detection , 1986, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[17] Emine Yilmaz,et al. Estimating average precision with incomplete and imperfect judgments , 2006, CIKM '06.

[18] Koen E. A. van de Sande,et al. Evaluating Color Descriptors for Object and Scene Recognition , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[19] Milind R. Naphade,et al. A probabilistic framework for semantic video indexing, filtering, and retrieval , 2001, IEEE Trans. Multim..

[20] Cees Snoek,et al. Can social tagged images aid concept-based video search? , 2009, 2009 IEEE International Conference on Multimedia and Expo.

[21] Rong Yan,et al. IBM multimedia search and retrieval system , 2007, CIVR '07.

[22] Marcel Worring,et al. Learning Social Tag Relevance by Neighbor Voting , 2009, IEEE Transactions on Multimedia.

[23] David S. Doermann,et al. Video retrieval using spatio-temporal descriptors , 2003, MULTIMEDIA '03.

[24] Brendan T. O'Connor,et al. From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series , 2010, ICWSM.

[25] R. Manmatha,et al. Using Maximum Entropy for Automatic Image Annotation , 2004, CIVR.

[26] Jie Tang,et al. Can we understand van gogh's mood?: learning to infer affects from images in social networks , 2012, ACM Multimedia.

[27] Jiebo Luo,et al. Large-scale multimodal semantic concept detection for consumer video , 2007, MIR '07.

[28] Andreas Dengel,et al. Meta-learning for evolutionary parameter optimization of classifiers , 2012, Machine Learning.

[29] Ming Yang,et al. Large-scale image classification: Fast feature extraction and SVM training , 2011, CVPR 2011.

[30] Paul Over,et al. Creating HAVIC: Heterogeneous Audio Visual Internet Collection , 2012, LREC.

[31] Dong Liu,et al. Tag ranking , 2009, WWW '09.

[32] Adrian Ulges,et al. Automatic concept-to-query mapping for web-based concept detector training , 2011, ACM Multimedia.

[33] Adrian Ulges,et al. Lookapp: interactive construction of web-based concept detectors , 2011, ICMR '11.

[34] Andrew McCallum,et al. Using Maximum Entropy for Text Classification , 1999 .

[35] HongJiang Zhang,et al. Motion Pattern-Based Video Classification and Retrieval , 2003, EURASIP J. Adv. Signal Process..

[36] Hao Su,et al. Object Bank: A High-Level Image Representation for Scene Classification & Semantic Feature Sparsification , 2010, NIPS.

[37] Stefan Winkler,et al. Emotion-based sequence of family photos , 2012, ACM Multimedia.

[38] Shih-Fu Chang,et al. Visual information retrieval from large distributed online repositories , 1997, CACM.

[39] Bo Pang,et al. Thumbs up? Sentiment Classification using Machine Learning Techniques , 2002, EMNLP.

[40] Andreas Dengel,et al. Automatic classifier selection for non-experts , 2012, Pattern Analysis and Applications.

[41] Julián Andrada-Félix,et al. Exchange-rate forecasts with simultaneous nearest-neighbour methods: evidence from the EMS , 1999 .

[42] Chong-Wah Ngo,et al. Towards google challenge: combining contextual and social information for web video categorization , 2009, ACM Multimedia.

[43] Allan Hanbury,et al. Affective image classification using features inspired by psychology and art theory , 2010, ACM Multimedia.

[44] Burr Settles,et al. Active Learning Literature Survey , 2009 .

[45] Koji Yatani,et al. Analysis of Adjective-Noun Word Pair Extraction Methods for Online Review Summarization , 2011, IJCAI.

[46] Brendan T. O'Connor,et al. Cheap and Fast – But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks , 2008, EMNLP.

[47] George Toderici,et al. Discriminative tag learning on YouTube videos with latent sub-tags , 2011, CVPR 2011.

[48] Ali Farhadi,et al. Describing objects by their attributes , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[49] José Luis Vicedo González,et al. TREC: Experiment and evaluation in information retrieval , 2007, J. Assoc. Inf. Sci. Technol..

[50] Daniel Gatica-Perez,et al. PLSA-based image auto-annotation: constraining the latent space , 2004, MULTIMEDIA '04.

[51] Luis von Ahn. Games with a Purpose , 2006, Computer.

[52] Jacob Ratkiewicz,et al. Traffic in Social Media I: Paths Through Information Networks , 2010, 2010 IEEE Second International Conference on Social Computing.

[53] Hermann Ney,et al. Discriminative training for object recognition using image patches , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[54] Markus Koch,et al. TubeTagger - YouTube-based Concept Detection , 2009, 2009 IEEE International Conference on Data Mining Workshops.

[55] Bernardo A. Huberman,et al. The Pulse of News in Social Media: Forecasting Popularity , 2012, ICWSM.

[56] Bernardo A. Huberman,et al. Predicting the popularity of online content , 2008, Commun. ACM.

[57] P. Ekman. Facial expression and emotion. , 1993, The American psychologist.

[58] Rongrong Ji,et al. SentiBank: large-scale ontology and classifiers for detecting sentiment and emotions in visual content , 2013, ACM Multimedia.

[59] Edward Y. Chang,et al. Support Vector Machine Concept-Dependent Active Learning for Image Retrieval , 2005 .

[60] Gang Wang,et al. TRECVID 2004 Search and Feature Extraction Task by NUS PRIS , 2004, TRECVID.

[61] Kai Wang,et al. End-to-end scene text recognition , 2011, 2011 International Conference on Computer Vision.

[62] Sharath Pankanti,et al. IBM Research and Columbia University TRECVID-2013 Multimedia Event Detection (MED), Multimedia Event Recounting (MER), Surveillance Event Detection (SED), and Semantic Indexing (SIN) Systems , 2013, TRECVID.

[63] Tao Mei,et al. Contextual in-image advertising , 2008, ACM Multimedia.

[64] Arnold W. M. Smeulders,et al. Active learning using pre-clustering , 2004, ICML.

[65] Andrew Zisserman,et al. The devil is in the details: an evaluation of recent feature encoding methods , 2011, BMVC.

[66] Vladimir Vapnik,et al. The Nature of Statistical Learning , 1995 .

[67] Marcel Worring,et al. Unsupervised multi-feature tag relevance learning for social image retrieval , 2010, CIVR '10.

[68] Geoffrey E. Hinton,et al. Learning representations by back-propagating errors , 1986, Nature.

[69] Stéphane Ayache,et al. Evaluation of active learning strategies for video indexing , 2007, Signal Process. Image Commun..

[70] Marcel Worring,et al. MediaMill: Video Search using a Thesaurus of 500 Machine Learned Concepts , 2006, SAMT.

[71] James Ze Wang,et al. Tagging over time: real-world image annotation by lightweight meta-learning , 2007, ACM Multimedia.

[72] Adrian Ulges,et al. Fast Discriminative Linear Models for Scalable Video Tagging , 2009, 2009 International Conference on Machine Learning and Applications.

[73] Masoud Mazloom,et al. Querying for video events by semantic signatures from few examples , 2013, MM '13.

[74] Cordelia Schmid,et al. Evaluation of Interest Point Detectors , 2000, International Journal of Computer Vision.

[75] Guojun Lu,et al. Content-based Image Retrieval Using Gabor Texture Features , 2000 .

[76] Alberto Del Bimbo,et al. Image retrieval by color semantics , 1999, Multimedia Systems.

[77] C. Schmid,et al. Scale-invariant shape features for recognition of object categories , 2004, CVPR 2004.

[78] Jie Tang,et al. Understanding the emotional impact of images , 2012, ACM Multimedia.

[79] Jianping Fan,et al. Personalized News Video Recommendation , 2009, MMM.

[80] Tobun Dorbin Ng,et al. Terrorism and Crime Related Weblog Social Network: Link, Content Analysis and Information Visualization , 2007, 2007 IEEE Intelligence and Security Informatics.

[81] Marcel Worring,et al. The challenge problem for automated detection of 101 semantic concepts in multimedia , 2006, MM '06.

[82] Martha Larson,et al. Overview of VideoCLEF 2009: New Perspectives on Speech-based Multimedia Content Enrichment , 2009, CLEF.

[83] Marcel Worring,et al. VideOlympics: Real-Time Evaluation of Multimedia Retrieval Systems , 2008, IEEE MultiMedia.

[84] Miroslaw Bober,et al. MPEG-7 visual shape descriptors , 2001, IEEE Trans. Circuits Syst. Video Technol..

[85] Tao Mei,et al. Correlative multi-label video annotation , 2007, ACM Multimedia.

[86] C. Darwin. The Expression of the Emotions in Man and Animals , .

[87] Shih-Fu Chang,et al. Video search reranking via information bottleneck principle , 2006, MM '06.

[88] Franciska de Jong,et al. Annotation of Heterogeneous Multimedia Content Using Automatic Speech Recognition , 2007, SAMT.

[89] Adrian Ulges,et al. Tag suggestion on youtube by personalizing content-based auto-annotation , 2012, CrowdMM '12.

[90] Mark Liberman,et al. Corpora for topic detection and tracking , 2002 .

[91] Rahul Malik,et al. VideoMule: a consensus learning approach to multi-label classification from noisy user-generated videos , 2009, MM '09.

[92] Nicolai Petkov,et al. Comparison of texture features based on Gabor filters , 2002, IEEE Trans. Image Process..

[93] Radford M. Neal. Pattern Recognition and Machine Learning , 2007, Technometrics.

[94] Marcel Worring,et al. Content-Based Image Retrieval at the End of the Early Years , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[95] Adrian Ulges,et al. A System That Learns to Tag Videos by Watching Youtube , 2008, ICVS.

[96] Christopher Joseph Pal,et al. YouTube Scale, Large Vocabulary Video Annotation , 2010, Video Search and Mining.

[97] Edoardo Ardizzone,et al. Video indexing using MPEG motion compensation vectors , 1999, Proceedings IEEE International Conference on Multimedia Computing and Systems.

[98] Adrian Ulges,et al. Dynamic vocabularies for web-based concept detection by trend discovery , 2012, ACM Multimedia.

[99] David D. Lewis,et al. Naive (Bayes) at Forty: The Independence Assumption in Information Retrieval , 1998, ECML.

[100] W. Bruce Croft,et al. Query expansion using local and global document analysis , 1996, SIGIR '96.

[101] Antonio Criminisi,et al. Harvesting Image Databases from the Web , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[102] Dong Wang,et al. Video diver: generic video indexing with diverse features , 2007, MIR '07.

[103] Adrian Ulges,et al. Visual Concept Learning from Weakly Labeled Web Videos , 2010, Video Search and Mining.

[104] Marijn ten Thij,et al. Modeling and predicting page-view dynamics on Wikipedia , 2012, ArXiv.

[105] Andrew Y. Ng,et al. Parsing Natural Scenes and Natural Language with Recursive Neural Networks , 2011, ICML.

[106] Laura A. Dabbish,et al. Designing games with a purpose , 2008, CACM.

[107] Marcel Worring,et al. High-Performance Distributed Image and Video Content Analysis with Parallel-Horus , 2007 .

[108] David A. Forsyth,et al. Animals on the Web , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[109] Alan Hanjalic,et al. Shot-boundary detection: unraveled and resolved? , 2002, IEEE Trans. Circuits Syst. Video Technol..

[110] Michael P. Clements,et al. Chapter 1 Forecasting Annual UK Inflation Using an Econometric Model over 1875–1991 , 2008 .

[111] Max J. Egenhofer,et al. Query Processing in Spatial-Query-by-Sketch , 1997, J. Vis. Lang. Comput..

[112] de Franciska Jong,et al. OLIVE: Speech-Based Video Retrieval , 1998 .

[113] Andreas Dengel,et al. Automatic Detection of CSA Media by Multi-modal Feature Fusion for Law Enforcement Support , 2014, ICMR.

[114] Srinivasan H. Sengamedu,et al. vADeo: video advertising system , 2007, ACM Multimedia.

[115] Ramesh C. Jain,et al. Metadata in video databases , 1994, SGMD.

[116] Christos Faloutsos,et al. QBIC project: querying images by content, using color, texture, and shape , 1993, Electronic Imaging.

[117] Rong Yan,et al. How many high-level concepts will fill the semantic gap in news video retrieval? , 2007, CIVR '07.

[118] Jiri Matas,et al. Robust wide-baseline stereo from maximally stable extremal regions , 2004, Image Vis. Comput..

[119] Michael McGill,et al. Introduction to Modern Information Retrieval , 1983 .

[120] Michael Isard,et al. Object retrieval with large vocabularies and fast spatial matching , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[121] Antonio Torralba,et al. Ieee Transactions on Pattern Analysis and Machine Intelligence 1 80 Million Tiny Images: a Large Dataset for Non-parametric Object and Scene Recognition , 2022 .

[122] Michael Isard,et al. Lost in quantization: Improving particular object retrieval in large scale image databases , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[123] Jens Lehmann,et al. DBpedia: A Nucleus for a Web of Open Data , 2007, ISWC/ASWC.

[124] Milind R. Naphade,et al. Classification of video events using 4-dimensional time-compressed motion features , 2007, CIVR '07.

[125] Jun Yang,et al. (Un)Reliability of video concept detection , 2008, CIVR '08.

[126] Hermann Ney,et al. Bag-of-visual-words models for adult image classification and filtering , 2008, 2008 19th International Conference on Pattern Recognition.

[127] James Ze Wang,et al. Quest for relevant tags using local interaction networks and visual content , 2010, MIR '10.

[128] Adrian Ulges,et al. Detecting pornographic video content by combining image features with motion information , 2009, ACM Multimedia.

[129] Michael Brady,et al. Saliency, Scale and Image Description , 2001, International Journal of Computer Vision.

[130] Yukinobu Taniguchi,et al. A novel region-based approach to visual concept modeling using web images , 2008, ACM Multimedia.

[131] Markus Koch,et al. TubeFiler: an automatic web video categorizer , 2009, ACM Multimedia.

[132] Bing Li,et al. Scaring or pleasing: exploit emotional impact of an image , 2012, ACM Multimedia.

[133] Adrian Ulges,et al. Identifying relevant frames in weakly labeled videos for training concept detectors , 2008, CIVR '08.

[134] Shih-Fu Chang,et al. Consumer video understanding: a benchmark database and an evaluation of human and machine performance , 2011, ICMR.

[135] Paul Over,et al. TRECVID: evaluating the effectiveness of information retrieval tasks on digital video , 2004, MULTIMEDIA '04.

[136] John R. Smith,et al. IBM Research TRECVID-2009 Video Retrieval System , 2009, TRECVID.

[137] Christoph H. Lampert,et al. Learning to detect unseen object classes by between-class attribute transfer , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[138] Ben Glocker,et al. Neighbourhood approximation using randomized forests , 2013, Medical Image Anal..

[139] Alexander G. Hauptmann,et al. The Use and Utility of High-Level Semantic Features in Video Retrieval , 2005, CIVR.

[140] Rong Yan,et al. Extreme video retrieval: joint maximization of human and computer performance , 2006, MM '06.

[141] B. S. Manjunath,et al. Texture Features for Browsing and Retrieval of Image Data , 1996, IEEE Trans. Pattern Anal. Mach. Intell..

[142] Andreas Dengel,et al. ICDAR 2011 Robust Reading Competition Challenge 2: Reading Text in Scene Images , 2011, 2011 International Conference on Document Analysis and Recognition.

[143] Jiebo Luo,et al. Aesthetics and Emotions in Images , 2011, IEEE Signal Processing Magazine.

[144] Adrian Ulges,et al. Pornography detection in video benefits (a lot) from a multi-modal approach , 2012, AMVA '12.

[145] Rainer Lienhart,et al. Reliable Transition Detection in Videos: A Survey and Practitioner's Guide , 2001, Int. J. Image Graph..

[146] John G. Breslin,et al. Enrichment and Ranking of the YouTube Tag Space and Integration with the Linked Data Cloud , 2009, SEMWEB.

[147] Stéphane Ayache,et al. Video Corpus Annotation Using Active Learning , 2008, ECIR.

[148] Marcel Worring,et al. Multimodal Video Indexing : A Review of the State-ofthe-art , 2001 .

[149] Yu He,et al. The YouTube video recommendation system , 2010, RecSys '10.

[150] Ullas Gargi,et al. Solving the label resolution problem in supervised video content classification , 2008, MIR '08.

[151] Andrew Zisserman,et al. Video Google: Efficient Visual Search of Videos , 2006, Toward Category-Level Object Recognition.

[152] Huanbo Luan,et al. Content-based video retrieval: Three example systems from TRECVid , 2008 .

[153] Alexander G. Hauptmann. Lessons for the Future from a Decade of Informedia Video Analysis Research , 2005, CIVR.

[154] Tao Mei,et al. Online video recommendation based on multimodal fusion and relevance feedback , 2007, CIVR '07.

[155] H. Varian,et al. Predicting the Present with Google Trends , 2009 .

[156] Marcel Worring,et al. Concept-Based Video Retrieval , 2009, Found. Trends Inf. Retr..

[157] Rong Yan,et al. Learning query-class dependent weights in automatic video retrieval , 2004, MULTIMEDIA '04.

[158] M. Osborne,et al. Bieber no more : First Story Detection using Twitter and Wikipedia , 2012 .

[159] Bernardo A. Huberman,et al. Usage patterns of collaborative tagging systems , 2006, J. Inf. Sci..

[160] Pietro Perona,et al. A Bayesian hierarchical model for learning natural scene categories , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[161] Chris Dyer,et al. Learning Semantics and Selectional Preference of Adjective-Noun Pairs , 2012, *SEM@NAACL-HLT.

[162] Hosung Park,et al. What is Twitter, a social network or a news media? , 2010, WWW '10.

[163] Shih-Fu Chang,et al. VisualSEEk: a fully automated content-based image query system , 1997, MULTIMEDIA '96.

[164] A. Paivio,et al. LEARNING OF ADJECTIVE-NOUN PAIRED ASSOCIATES AS A FUNCTION OF ADJECTIVE-NOUN WORD ORDER AND NOUN ABSTRACTNESS. , 1963, Canadian journal of psychology.

[165] Trevor Cohn,et al. Trendminer: An Architecture for Real Time Analysis of Social Media Text , 2012, ICWSM 2012.

[166] Adrian Ulges,et al. Style modeling for tagging personal photo collections , 2009, CIVR '09.

[167] Rong Yan,et al. Semantic concept-based query expansion and re-ranking for multimedia retrieval , 2007, ACM Multimedia.

[168] Greg Schohn,et al. Less is More: Active Learning with Support Vector Machines , 2000, ICML.

[169] Brian N. Bershad,et al. Why we search: visualizing and predicting user behavior , 2007, WWW '07.

[170] Jiebo Luo,et al. The wisdom of social multimedia: using flickr for prediction and forecast , 2010, ACM Multimedia.

[171] Howard D. Wactlar,et al. Putting active learning into multimedia applications: dynamic definition and refinement of concept classifiers , 2005, MULTIMEDIA '05.

[172] Peter D. Turney. Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews , 2002, ACL.

[173] Thomas G. Dietterich. An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization , 2000, Machine Learning.

[174] Roberto Cipolla,et al. Semantic texton forests for image categorization and segmentation , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[175] Roelof van Zwol,et al. Flickr tag recommendation based on collective knowledge , 2008, WWW.

[176] David S. Doermann,et al. Automatic text detection and tracking in digital video , 2000, IEEE Trans. Image Process..

[177] Alan F. Smeaton,et al. Large Scale Evaluations of Multimedia Information Retrieval: The TRECVid Experience , 2005, CIVR.

[178] Hila Becker,et al. Beyond Trending Topics: Real-World Event Identification on Twitter , 2011, ICWSM.

[179] Yelena Yesha,et al. Keyframe-based video summarization using Delaunay clustering , 2006, International Journal on Digital Libraries.

[180] Georgia Koutrika,et al. Combating spam in tagging systems: An evaluation , 2008, TWEB.

[181] Yunfei Chen,et al. Evaluating the visual quality of web pages using a computational aesthetic approach , 2011, WSDM '11.

[182] Janyce Wiebe,et al. A Computational Theory of Perspective and Reference in Narrative , 1988, ACL.

[183] K. Scherer,et al. The Geneva affective picture database (GAPED): a new 730-picture database focusing on valence and normative significance , 2011, Behavior research methods.

[184] Christopher G. Harris,et al. A Combined Corner and Edge Detector , 1988, Alvey Vision Conference.

[185] Andrew Zisserman,et al. Learning Visual Attributes , 2007, NIPS.

[186] Koen E. A. van de Sande,et al. A comparison of color features for visual concept classification , 2008, CIVR '08.

[187] Alberto Del Bimbo,et al. Tag suggestion and localization in user-generated videos based on social knowledge , 2010, WSM@MM.

[188] Paul Over,et al. High-level feature detection from video in TRECVid: a 5-year retrospective of achievements , 2009 .

[189] Markus Koch,et al. Content analysis meets viewers: linking concept detection with demographics on YouTube , 2012, International Journal of Multimedia Information Retrieval.

[190] Susan T. Dumais,et al. Modeling and predicting behavioral dynamics on the web , 2012, WWW.

[191] Adrian Ulges,et al. Relevance filtering meets active learning: improving web-based concept detectors , 2010, MIR '10.

[192] Janyce Wiebe,et al. Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis , 2005, HLT.

[193] Andreas Dengel,et al. Analysis and forecasting of trending topics in online media streams , 2013, ACM Multimedia.

[194] Stéphane Ayache,et al. TRECVID 2007: Collaborative Annotation using Active Learning , 2007, TRECVID.

[195] Yongdong Zhang,et al. Web video categorization based on Wikipedia categories and content-duplicated open resources , 2010, ACM Multimedia.

[196] Wei-Ying Ma,et al. Argo: intelligent advertising by mining a user's interest from his photo collections , 2009, KDD Workshop on Data Mining and Audience Intelligence for Advertising.

[197] David G. Lowe,et al. Object Class Recognition with Many Local Features , 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.

[198] Hideyuki Tamura,et al. Textural Features Corresponding to Visual Perception , 1978, IEEE Transactions on Systems, Man, and Cybernetics.

[199] Isabell M. Welpe,et al. Predicting Elections with Twitter: What 140 Characters Reveal about Political Sentiment , 2010, ICWSM.

[200] Cordelia Schmid,et al. A performance evaluation of local descriptors , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[201] Shih-Fu Chang,et al. To search or to label?: predicting the performance of search-based automatic image classifiers , 2006, MIR '06.

[202] Jean-Marc Odobez,et al. A Thousand Words in a Scene , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[203] Bill Triggs,et al. Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[204] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[205] John R. Smith,et al. Modeling semantic concepts to support query by keywords in video , 2002, Proceedings. International Conference on Image Processing.

[206] T. Landauer,et al. Indexing by Latent Semantic Analysis , 1990 .

[207] David M. Pennock,et al. Mining the peanut gallery: opinion extraction and semantic classification of product reviews , 2003, WWW '03.

[208] Cees G. M. Snoek,et al. The MediaMill at TRECVID 2013: : Searching concepts, Objects, Instances and events in video , 2013, TRECVID.

[209] James Allan,et al. Introduction to topic detection and tracking , 2002 .

[210] Tao Mei,et al. Multi-Layer Multi-Instance Learning for Video Concept Detection , 2008, IEEE Transactions on Multimedia.

[211] Omar Alonso,et al. Hashtags as Milestones in Time , 2012 .

[212] Brian C. O'Connor,et al. Selecting Key Frames of Moving Image Documents: A Digital Environment for Analysis and Navigation. , 1991 .

[213] Bo Zhang,et al. A Formal Study of Shot Boundary Detection , 2007, IEEE Transactions on Circuits and Systems for Video Technology.

[214] Claire Cardie,et al. Annotating Expressions of Opinions and Emotions in Language , 2005, Lang. Resour. Evaluation.

[215] Jin Zhao,et al. Video Retrieval Using High Level Features: Exploiting Query Matching and Confidence-Based Weighting , 2006, CIVR.

[216] Fei-Fei Li,et al. OPTIMOL: Automatic Online Picture Collection via Incremental Model Learning , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[217] Dong Wang,et al. The importance of query-concept-mapping for automatic video retrieval , 2007, ACM Multimedia.

[218] Jun Yang,et al. A framework for classifier adaptation and its applications in concept detection , 2008, MIR '08.

[219] Chong-Wah Ngo,et al. Exploring inter-concept relationship with context space for semantic video indexing , 2009, CIVR '09.

[220] Paul Over,et al. TRECVID 2009 -- Goals, Tasks, Data, Evaluation Mechanisms and Metrics | NIST , 2010 .

[221] Christian Petersohn. Fraunhofer HHI at TRECVID 2004: Shot Boundary Detection System , 2004, TRECVID.

[222] Alexei A. Efros,et al. Discovering objects and their location in images , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[223] Rong Yan,et al. Probabilistic latent query analysis for combining multiple retrieval sources , 2006, SIGIR.

[224] Avideh Zakhor,et al. Applications of Video-Content Analysis and Retrieval , 2002, IEEE Multim..

[225] Cordelia Schmid,et al. Local Features and Kernels for Classification of Texture and Object Categories: A Comprehensive Study , 2006, 2006 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'06).

[226] Jianxiong Xiao,et al. What makes an image memorable , 2011 .

[227] Daphne Koller,et al. Support Vector Machine Active Learning with Applications to Text Classification , 2000, J. Mach. Learn. Res..

[228] J. Shotton,et al. Decision Forests for Classification, Regression, Density Estimation, Manifold Learning and Semi-Supervised Learning , 2011 .

[229] Bu-Sung Lee,et al. Event Detection in Twitter , 2011, ICWSM.

[230] Shih-Fu Chang,et al. Automatic discovery of query-class-dependent models for multimodal search , 2005, MULTIMEDIA '05.

[231] Michael J. Black,et al. The Robust Estimation of Multiple Motions: Parametric and Piecewise-Smooth Flow Fields , 1996, Comput. Vis. Image Underst..

[232] Xirong Li,et al. Evaluating sources and strategies for learning video concepts from social media , 2013, 2013 11th International Workshop on Content-Based Multimedia Indexing (CBMI).

[233] Ryen W. White,et al. An implicit feedback approach for interactive information retrieval , 2006, Inf. Process. Manag..

[234] Alan Hanjalic,et al. An integrated scheme for automated video abstraction based on unsupervised cluster-validity analysis , 1999, IEEE Trans. Circuits Syst. Video Technol..

[235] B. S. Manjunath,et al. Color and texture descriptors , 2001, IEEE Trans. Circuits Syst. Video Technol..

[236] James Ze Wang,et al. Content-based image retrieval: approaches and trends of the new age , 2005, MIR '05.

[237] Edward Y. Chang,et al. Support vector machine active learning for image retrieval , 2001, MULTIMEDIA '01.

[238] Koen E. A. van de Sande,et al. Recommendations for video event recognition using concept vocabularies , 2013, ICMR.

[239] Qianhua He,et al. A survey on emotional semantic image retrieval , 2008, 2008 15th IEEE International Conference on Image Processing.

[240] Andrea Esuli,et al. SENTIWORDNET: A Publicly Available Lexical Resource for Opinion Mining , 2006, LREC.

[241] Markus Koch,et al. Linking visual concept detection with viewer demographics , 2012, ICMR '12.

[242] C. Osgood,et al. The Measurement of Meaning , 1958 .

[243] Fei-Fei Li,et al. ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[244] B. S. Manjunath,et al. Introduction to MPEG-7: Multimedia Content Description Interface , 2002 .

[245] Mike Thelwall,et al. Sentiment strength detection in short informal text , 2010 .

[246] Tao Mei,et al. SocialTransfer: cross-domain transfer learning from social streams for media applications , 2012, ACM Multimedia.

[247] Andrew Zisserman,et al. Image Classification using Random Forests and Ferns , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[248] Jing Huang,et al. Image indexing using color correlograms , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[249] Roger Mohr,et al. A probabilistic framework of selecting effective key frames for video browsing and indexing , 2000 .

[250] Adrian Ulges,et al. Automatic detection of child pornography using color visual words , 2011, 2011 IEEE International Conference on Multimedia and Expo.

[251] Chong-Wah Ngo,et al. VIREO/DVMM at TRECVID 2009: High-Level Feature Extraction, Automatic Video Search, and Content-Based Copy Detection , 2009, TRECVID.

[252] Jure Leskovec,et al. Patterns of temporal variation in online media , 2011, WSDM '11.

[253] Shih-Fu Chang,et al. Reranking Methods for Visual Search , 2007, IEEE MultiMedia.

[254] Meng Wang,et al. Automatic video annotation by semi-supervised learning with kernel density estimation , 2006, MM '06.

[255] Bernard J. Jansen,et al. Twitter power: Tweets as electronic word of mouth , 2009, J. Assoc. Inf. Sci. Technol..

[256] R. Plutchik. Emotion, a psychoevolutionary synthesis , 1980 .

[257] Shuicheng Yan,et al. Inferring semantic concepts from community-contributed images and noisy tags , 2009, ACM Multimedia.

[258] Paul Over,et al. Evaluation campaigns and TRECVid , 2006, MIR '06.

[259] James Ze Wang,et al. Image retrieval: Ideas, influences, and trends of the new age , 2008, CSUR.

[260] Lillian Lee,et al. Opinion Mining and Sentiment Analysis , 2008, Found. Trends Inf. Retr..

[261] Rainer Lienhart,et al. Comparison of automatic shot boundary detection algorithms , 1998, Electronic Imaging.

[262] Chong-Wah Ngo,et al. Semantic context transfer across heterogeneous sources for domain adaptive video search , 2009, ACM Multimedia.

[263] Stefanos D. Kollias,et al. A stochastic framework for optimal key frame extraction from MPEG video databases , 1999, 1999 IEEE Third Workshop on Multimedia Signal Processing (Cat. No.99TH8451).

[264] Ricardo Vilalta,et al. A Perspective View and Survey of Meta-Learning , 2002, Artificial Intelligence Review.

[265] Mohammad Soleymani,et al. The Community and the Crowd: Multimedia Benchmark Dataset Development , 2012, IEEE MultiMedia.

[266] Nicu Sebe,et al. In the eye of the beholder: employing statistical analysis and eye tracking for analyzing abstract paintings , 2012, ACM Multimedia.

[267] Nicu Sebe,et al. Emotional valence categorization using holistic image features , 2008, 2008 15th IEEE International Conference on Image Processing.

[268] Jeonghee Yi,et al. Sentiment analysis: capturing favorability using natural language processing , 2003, K-CAP '03.

[269] Maarten de Rijke,et al. Content-Based Analysis Improves Audiovisual Archive Retrieval , 2012, IEEE Transactions on Multimedia.

[270] H. Bischof,et al. Evaluation of local detectors on non-planar scenes , 2004 .

[271] Eero Hyvönen,et al. Ontology-Based Image Retrieval , 2003, WWW.

[272] Marcel Worring,et al. The MediaMill TRECVID 2009 Semantic Video Search Engine , 2009, TRECVID.

[273] Rongrong Ji,et al. Large-scale visual sentiment ontology and detectors using adjective noun pairs , 2013, ACM Multimedia.

[274] Marcel Worring,et al. Personalizing automated image annotation using cross-entropy , 2011, ACM Multimedia.

[275] Andrew McCallum,et al. Toward Optimal Active Learning through Sampling Estimation of Error Reduction , 2001, ICML.

[276] Markus Koch,et al. Learning automatic concept detectors from online video , 2010, Comput. Vis. Image Underst..

[277] Steven S. Beauchemin,et al. The computation of optical flow , 1995, CSUR.

[278] Ellen Riloff,et al. Creating Subjective and Objective Sentence Classifiers from Unannotated Texts , 2005, CICLing.

[279] Dragutin Petkovic,et al. Query by Image and Video Content: The QBIC System , 1995, Computer.

[280] Fei-Fei Li,et al. Video Event Understanding Using Natural Language Descriptions , 2013, 2013 IEEE International Conference on Computer Vision.

[281] Trevor Darrell,et al. Detection bank: an object detection based video representation for multimedia event recognition , 2012, ACM Multimedia.

[282] Luc Van Gool,et al. The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[283] Gustavo Carneiro,et al. Supervised Learning of Semantic Classes for Image Annotation and Retrieval , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[284] Rob J Hyndman,et al. Automatic Time Series Forecasting: The forecast Package for R , 2008 .

[285] Cor J. Veenman,et al. Robust Scene Categorization by Learning Image Statistics in Context , 2006, 2006 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'06).

[286] P. Bartlett,et al. Probabilities for SV Machines , 2000 .

[287] Chong-Wah Ngo,et al. Fusing semantics, observability, reliability and diversity of concept detectors for video search , 2008, ACM Multimedia.

[288] Marcel Worring,et al. Harvesting Social Images for Bi-Concept Search , 2012 .

[289] John R. Smith,et al. On the detection of semantic concepts at TRECVID , 2004, MULTIMEDIA '04.

[290] Marcel Worring,et al. On the surplus value of semantic video analysis beyond the key frame , 2005, 2005 IEEE International Conference on Multimedia and Expo.

[291] Gerard Salton,et al. Improving retrieval performance by relevance feedback , 1997, J. Am. Soc. Inf. Sci..

[292] Andrew Zisserman,et al. Video Google: a text retrieval approach to object matching in videos , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[293] Marcel Worring,et al. Learning tag relevance by neighbor voting for social image retrieval , 2008, MIR '08.

[294] George A. Miller,et al. WordNet: A Lexical Database for English , 1995, HLT.

[295] Weiqiang Wang,et al. Weakly-Supervised Violence Detection in Movies with Audio and Video Based Co-training , 2009, PCM.

[296] Marcel Worring,et al. MediaMill: fast and effective video search using the forkbrowser , 2008, CIVR '08.

[297] Luc Van Gool,et al. SURF: Speeded Up Robust Features , 2006, ECCV.

[298] Georges Quénot,et al. TRECVID 2015 - An Overview of the Goals, Tasks, Data, Evaluation Mechanisms and Metrics , 2011, TRECVID.

[299] W. Chu. Studying Aesthetics in Photographic Images Using a Computational Approach , 2013 .

[300] David G. Stork,et al. Pattern classification, 2nd Edition , 2000 .

[301] Johan Bollen,et al. Twitter mood predicts the stock market , 2010, J. Comput. Sci..

[302] Paul Over,et al. Video shot boundary detection: Seven years of TRECVid activity , 2010, Comput. Vis. Image Underst..

[303] Markus Koch,et al. Learning TRECVID'08 High-Level Features from YouTube , 2008, TRECVID.

[304] Alan Hanjalic,et al. Automated high-level movie segmentation for advanced video-retrieval systems , 1999, IEEE Trans. Circuits Syst. Video Technol..

[305] Chong-Wah Ngo,et al. Towards optimal bag-of-features for object categorization and semantic video retrieval , 2007, CIVR '07.

[306] Adrian Ulges. Visual Concept Learning from User-tagged Web Video , 2009 .

[307] Benoit Huet,et al. Concept detector refinement using social videos , 2010, VLS-MCMR '10.

[308] D. Garc,et al. UC3M High Level Feature Extraction at TRECVID 2008 , 2008 .

[309] Shih-Fu Chang,et al. CU-VIREO374: Fusing Columbia374 and VIREO374 for Large Scale Semantic Concept Detection , 2008 .

[310] P. Roth,et al. Survey of Appearance-Based Methods for Object Recognition , 2008 .

[311] J. Ross Quinlan,et al. Induction of Decision Trees , 1986, Machine Learning.

[312] Lifeng Sun,et al. Propagation-based social-aware replication for social video contents , 2012, ACM Multimedia.

[313] Luciano Sbaiz,et al. Finding meaning on YouTube: Tag recommendation and category discovery , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[314] John R. Smith,et al. Large-scale concept ontology for multimedia , 2006, IEEE MultiMedia.

[315] D. Rubin,et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .

[316] Alexander C. Berg,et al. Automatic Attribute Discovery and Characterization from Noisy Web Data , 2010, ECCV.

[317] Fei-Fei Li,et al. Attribute Learning in Large-Scale Datasets , 2010, ECCV Workshops.

[318] Shaul Markovitch,et al. Similarity of Temporal Query Logs Based on ARIMA Model , 2006, Sixth IEEE International Conference on Data Mining - Workshops (ICDMW'06).

[319] Hermann Ney,et al. Features for image retrieval: an experimental comparison , 2008, Information Retrieval.

[320] Rong Yan,et al. A review of text and image retrieval approaches for broadcast news video , 2007, Information Retrieval.

[321] Adrian Ulges,et al. Keyframe Extraction for Video Tagging & Summarization , 2008, Informatiktage.

[322] George Pavlidis,et al. Methods for 3D digitization of Cultural Heritage , 2007 .

[323] R. Manmatha,et al. Multiple Bernoulli relevance models for image and video annotation , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[324] Janyce Wiebe,et al. Identifying Collocations for Recognizing Opinions , 2001 .

[325] Wei Dai,et al. Joint categorization of queries and clips for web-based video search , 2006, MIR '06.

[326] Hrishikesh B. Aradhye,et al. Video2Text: Learning to Annotate Video Content , 2009, 2009 IEEE International Conference on Data Mining Workshops.

[327] Tao Mei,et al. VideoSense: A Contextual In-Video Advertising System , 2009, IEEE Transactions on Circuits and Systems for Video Technology.

[328] David M. Pennock,et al. Predicting consumer behavior with Web search , 2010, Proceedings of the National Academy of Sciences.

[329] Matthijs C. Dorst. Distinctive Image Features from Scale-Invariant Keypoints , 2011 .

[330] Mubarak Shah,et al. A holistic approach to aesthetic enhancement of photographs , 2011, TOMCCAP.

[331] Mubarak Shah,et al. Person-on-person violence detection in video data , 2002, Object recognition supported by user interaction for service robots.

[332] Tony Lindeberg,et al. Feature Detection with Automatic Scale Selection , 1998, International Journal of Computer Vision.

[333] Cordelia Schmid,et al. Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[334] Pietro Perona,et al. Learning object categories from Google's image search , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[335] Damian Borth,et al. DFKI-IUPR participation in TRECVID'09 High-level Feature Extraction Task , 2009, TRECVID.

[336] Fabrice Souvannavong,et al. Latent semantic indexing for semantic content detection of video shots , 2004, 2004 IEEE International Conference on Multimedia and Expo (ICME) (IEEE Cat. No.04TH8763).

[337] Keiji Yanai,et al. Probabilistic web image gathering , 2005, MIR '05.

[338] Shih-Fu Chang,et al. Image Retrieval: Current Techniques, Promising Directions, and Open Issues , 1999, J. Vis. Commun. Image Represent..

[339] Theodoros Bozios,et al. Advanced Techniques for Personalized Advertising in a Digital TV Environment: The iMEDIA System , 2001 .

[340] Stefano Soatto,et al. Filtering Internet image search results towards keyword based category recognition , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[341] Rongrong Ji,et al. Weak attributes for large-scale image retrieval , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[342] Masoud Mazloom,et al. Searching informative concept banks for video event detection , 2013, ICMR.

[343] David G. Lowe,et al. Object recognition from local scale-invariant features , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[344] Mike Y. Chen,et al. Yahoo! For Amazon: Sentiment Parsing from Small Talk on the Web , 2001 .

[345] Adrian Ulges,et al. Navidgator - Similarity Based Browsing for Image and Video Databases , 2008, KI.

[346] Shih-Fu Chang,et al. Combining text and audio-visual features in video indexing , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005..

[347] Meng Wang,et al. Semi-automatic video annotation based on active learning with multiple complementary predictors , 2005, MIR '05.

[348] Arnold W. M. Smeulders,et al. Visual-Concept Search Solved? , 2010, Computer.

[349] Gabriela Csurka,et al. Assessing the aesthetic quality of photographs using generic image descriptors , 2011, 2011 International Conference on Computer Vision.

[350] Mary Czerwinski,et al. Find Me the Right Content! Diversity-Based Sampling of Social Media Spaces for Topic-Centric Search , 2011, ICWSM.

[351] Li-Rong Dai,et al. Video Annotation by Active Learning and Cluster Tuning , 2006, 2006 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'06).

[352] Alan F. Smeaton. Techniques used and open challenges to the analysis, indexing and retrieval of digital video , 2007, Inf. Syst..

[353] Matthew Hurst,et al. BlogPulse: Automated Trend Discovery for Weblogs , 2003 .

[354] William A. Gale,et al. A sequential algorithm for training text classifiers , 1994, SIGIR '94.

[355] Shih-Fu Chang,et al. Columbia University’s Baseline Detectors for 374 LSCOM Semantic Visual Concepts , 2007 .

[356] Cor J. Veenman,et al. Visual Word Ambiguity , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[357] Shree K. Nayar,et al. Attribute and simile classifiers for face verification , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[358] Yihong Gong,et al. Lessons Learned from Building a Terabyte Digital Video Library , 1999, Computer.

[359] Mathias Lux,et al. An Exploratory Study on the Explicitness of User Intentions in Digital Photo Retrieval , 2009 .