Limiting Tags Fosters Efficiency

Tagging facilitates information retrieval in social media and other online communities by allowing users to organize and describe online content. Researchers have found that the efficiency of tagging systems steadily decreases over time because tags become less precise in identifying specific documents, i.e., they lose their descriptiveness. However, previous work has not established how, or even whether, community managers can improve the efficiency of tags. In this work, we use information-theoretic measures to track the descriptive and retrieval efficiency of tags on Stack Overflow, a question-answering system that strictly limits the number of tags users can specify per question. We observe that tagging efficiency stabilizes over time, while tag content and descriptiveness both increase. To explain this observation, we hypothesize that limiting the number of tags fosters novelty and diversity in tag usage, two properties that both benefit tagging efficiency. To provide qualitative evidence supporting our hypothesis, we present a statistical model of tagging that demonstrates how novelty and diversity lead to greater tag efficiency in the long run. Our work offers insights into policies for improving information organization and retrieval in online communities.
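The abstract does not spell out the information-theoretic measures it uses. As a minimal illustrative sketch, assuming the standard Shannon quantities, one could compute the tag entropy H(T), the conditional entropy of documents given tags H(D|T), and the mutual information I(D;T) from a list of (document, tag) assignments. The function names and the toy data below are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy in bits of the distribution implied by raw counts."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts)


def tagging_measures(assignments):
    """Entropy-based summary of (document_id, tag) assignments.

    Assumption: probabilities are estimated as relative frequencies
    of tag assignments; this is an illustrative choice, not the
    paper's stated estimator.
    """
    pair_counts = Counter(assignments)
    tag_counts = Counter()
    doc_counts = Counter()
    for (doc, tag), n in pair_counts.items():
        tag_counts[tag] += n
        doc_counts[doc] += n

    h_tags = entropy(tag_counts.values())    # H(T): spread of tag usage
    h_docs = entropy(doc_counts.values())    # H(D)
    h_joint = entropy(pair_counts.values())  # H(D, T)
    h_docs_given_tags = h_joint - h_tags     # H(D|T): uncertainty left after seeing a tag
    mutual_info = h_docs - h_docs_given_tags # I(D;T): what tags reveal about documents

    return {
        "H(T)": h_tags,
        "H(D|T)": h_docs_given_tags,
        "I(D;T)": mutual_info,
    }


if __name__ == "__main__":
    # Hypothetical toy data: four questions, one tag each.
    toy = [("q1", "python"), ("q2", "python"), ("q3", "java"), ("q4", "rust")]
    print(tagging_measures(toy))
```

Under this reading, a lower H(D|T) means that a tag narrows down the set of matching documents more sharply, which is one way to interpret tags being "descriptive"; tracking these quantities over successive time windows would show whether efficiency declines or stabilizes.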
