How Flickr helps us make sense of the world: context and content in community-contributed media collections

The advent of media-sharing sites like Flickr and YouTube has drastically increased the volume of community-contributed multimedia resources available on the web. These collections have a previously unimagined depth and breadth, and have generated new opportunities, and new challenges, for multimedia research. How do we analyze, understand and extract patterns from these new collections? How can we use these unstructured, unrestricted community contributions of media (and annotation) to generate "knowledge"? As a test case, we study Flickr, a popular photo-sharing website. Flickr supports photo, time and location metadata, as well as a lightweight annotation model. We extract information from this dataset using two different approaches. First, we employ a location-driven approach to generate aggregate knowledge in the form of "representative tags" for arbitrary areas in the world. Second, we use a tag-driven approach to automatically extract place and event semantics for Flickr tags, based on each tag's metadata patterns. With the patterns we extract from tags and metadata, vision algorithms can be employed with greater precision. In particular, we demonstrate a location-tag-vision-based approach to retrieving images of geography-related landmarks and features from the Flickr dataset. The results suggest that community-contributed media and annotation can enhance and improve our access to multimedia resources, and our understanding of the world.
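
To make the location-driven idea concrete, here is a minimal sketch of how "representative tags" for an area might be scored. The abstract does not specify the scoring function, so the TF-IDF-style weighting below (distinct users inside the area, multiplied by an inverse global frequency) is an assumption for illustration only, not the authors' method. The function name `representative_tags`, the tuple layout of `photos`, and the sample data are all hypothetical.

```python
import math
from collections import Counter


def representative_tags(photos, bbox, top_k=3):
    """Rank tags for a bounding box with a TF-IDF-style score (illustrative assumption).

    photos: list of (lat, lon, user_id, tags) tuples.
    bbox:   (min_lat, min_lon, max_lat, max_lon).
    """
    min_lat, min_lon, max_lat, max_lon = bbox
    users_inside = {}    # tag -> set of distinct users who used it inside the box
    overall = Counter()  # tag -> number of photos anywhere that carry it

    for lat, lon, user, tags in photos:
        in_box = min_lat <= lat <= max_lat and min_lon <= lon <= max_lon
        for tag in set(tags):  # count each tag once per photo
            overall[tag] += 1
            if in_box:
                users_inside.setdefault(tag, set()).add(user)

    total = len(photos)
    scores = {
        # Counting users rather than photos keeps one prolific user from dominating;
        # the log term down-weights tags that are common everywhere.
        tag: len(users) * math.log(total / overall[tag])
        for tag, users in users_inside.items()
    }
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:top_k]]


# Hypothetical sample data: (lat, lon, user_id, tags)
sample_photos = [
    (37.8199, -122.4783, "u1", ["goldengatebridge", "sanfrancisco"]),
    (37.8197, -122.4786, "u2", ["goldengatebridge", "sanfrancisco"]),
    (37.8201, -122.4780, "u3", ["goldengatebridge", "fog"]),
    (37.7952, -122.3937, "u1", ["ferrybuilding", "sanfrancisco"]),
    (37.7955, -122.3940, "u4", ["sanfrancisco"]),
    (40.6892, -74.0445, "u5", ["statueofliberty", "newyork"]),
]
bridge_area = (37.81, -122.49, 37.83, -122.47)
print(representative_tags(sample_photos, bridge_area, top_k=2))
```

In this sketch a city-wide tag such as `sanfrancisco` scores lower inside the small box than a tag concentrated there, which is the intuition behind picking tags that characterize one area rather than the whole collection.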
