Leveraging Massive User Contributions for Knowledge Extraction

The collective intelligence that emerges from the collaboration, competition, and coordination among individuals in social networks has opened up new opportunities for knowledge extraction. Valuable knowledge is stored, and often “hidden”, in massive user contributions, challenging researchers to find methods that leverage these contributions and uncover this knowledge. In this chapter we investigate the problem of knowledge extraction from social media. We provide background on knowledge extraction methods that operate on social media, and present three methods that use Flickr data to extract different types of knowledge, namely the community structure of tag networks, the emerging trends and events in users' tag activity, and the associations between image regions and tags in user-tagged images. Our evaluation results show that, despite the noise in massive user contributions, efficient methods can be developed to mine the semantics emerging from these data and facilitate knowledge extraction.
