Words and Pictures: Crowdsource Discovery beyond Image Semantics
Large collections of annotated images from the Web and crowdsourcing, together with powerful machine learning tools, have played a crucial role in the rapid progress of semantic image recognition in recent years. However, as suggested by the saying "a picture is worth a thousand words," Web resources and crowdsourcing forums carry much richer information than semantic labels alone. This additional information spans rich, largely unexploited dimensions such as visual aesthetics, emotions, sentiments, user intention, and knowledge structure. Discovering these novel dimensions of image description beyond semantics will have a large impact on emerging applications such as personalized search and content recommendation, but it requires rigorous research in concept definition, task formulation, data crawling, and evaluation mechanisms. In this talk, I will address these issues by sharing our experiences [1-4] in discovering beyond-semantic visual descriptions related to visual sentiment, video upload intent classification, cultural influence on visual sentiment, and finally a wiki-style video event ontology.
[1] Rongrong Ji, et al. Large-scale visual sentiment ontology and detectors using adjective noun pairs. ACM Multimedia, 2013.
[2] Tao Chen, et al. Uploader Intent for Online Video: Typology, Inference, and Applications. IEEE Transactions on Multimedia, 2015.
[3] Dong Liu, et al. EventNet: A Large Scale Structured Concept Library for Complex Event Detection in Video. ACM Multimedia, 2015.
[4] Tao Chen, et al. Visual Affect Around the World: A Large-scale Multilingual Visual Sentiment Ontology. ACM Multimedia, 2015.