Understanding the retrieval effectiveness of collaborative tags and author keywords in different retrieval environments: An experimental study on medical collections

This study investigates the retrieval effectiveness of collaborative tags and author keywords in different environments through controlled experiments. Three test collections were built. The first tests the impact of tags on retrieval performance when only the title and abstract are available (the abstract environment). The second tests the impact of tags when the full text is available (the full‐text environment). The third compares the retrieval effectiveness of tags and author keywords in the abstract environment. Both single‐word queries and phrase queries are tested to understand the impact of query type. Our findings suggest that including tags and author keywords in indexes can enhance recall but may improve or worsen average precision, depending on the retrieval environment and query type. Indexing tags and author keywords improved average precision for phrase queries in the abstract environment, whereas indexing tags led to a significant drop in average precision for single‐word queries in the full‐text environment. The comparison between tags and author keywords in the abstract environment indicates that they have a comparable impact on average precision, but author keywords are more advantageous in enhancing recall. These findings have useful implications for designing retrieval systems that incorporate tags and author keywords.
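The two measures reported above, recall and average precision, can move in opposite directions when extra index terms are added. The following is a minimal sketch using the standard (non-interpolated) definitions. It is not the study's evaluation code, and the document IDs, rankings, and function names are hypothetical, chosen only to illustrate how indexing additional terms such as tags might retrieve more relevant documents while also pushing non-relevant ones higher in the ranking.

```python
def recall(ranked, relevant):
    """Fraction of the relevant documents that appear in the ranked list."""
    if not relevant:
        return 0.0
    return len(set(ranked) & relevant) / len(relevant)


def average_precision(ranked, relevant):
    """Sum of precision at each rank holding a relevant document,
    divided by the total number of relevant documents."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / len(relevant)


# Hypothetical toy data (assumption, not taken from the study).
relevant = {"d1", "d2", "d3"}

# Ranking from an index built on title and abstract only.
baseline = ["d1", "d4", "d2"]

# Ranking after adding tags to the index: all relevant documents are now
# retrieved, but some non-relevant documents rank above them.
with_tags = ["d5", "d1", "d4", "d6", "d2", "d3"]

for name, ranking in [("baseline", baseline), ("with tags", with_tags)]:
    print(name,
          "recall=%.3f" % recall(ranking, relevant),
          "AP=%.3f" % average_precision(ranking, relevant))
```

In this toy case recall rises while average precision falls, which mirrors the kind of trade-off the abstract describes; with a different hypothetical ranking both measures could improve.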
