Semantic Stability and Implicit Consensus in Social Tagging Streams

One potential disadvantage of social tagging systems is that, due to the lack of a centralized vocabulary, a crowd of users may never reach a consensus on the description of resources (e.g., books, images, users, or songs) on the Web. Yet previous research has provided evidence that the tag distributions of resources in social tagging systems may become semantically stable over time, as more and more users tag them and implicitly agree on the relative importance of tags for a resource. At the same time, previous work has raised an array of new questions, such as: 1) how can we assess semantic stability in a robust and methodical way? 2) does semantic stabilization vary across different social tagging systems? and, ultimately, 3) what are the factors that can explain semantic stabilization in such systems? In this work, we tackle these questions by: 1) presenting a novel and robust method for assessing semantic stability, which overcomes a number of limitations of existing methods; 2) empirically investigating semantic stabilization in different social tagging systems with distinct domains and properties; and 3) detecting potential causes of stabilization and implicit consensus, specifically imitation behavior, shared background knowledge, and intrinsic properties of natural language. Our results show that tagging streams generated by a combination of imitation dynamics and shared background knowledge stabilize faster and reach higher semantic stability than tagging streams generated via imitation dynamics or natural language phenomena alone.
