Tagging and searching: Search retrieval effectiveness of folksonomies on the World Wide Web

Many Web sites have begun allowing users to submit items to a collection and tag them with keywords. The folksonomies built from these tags are an interesting topic that has seen little empirical research. This study compared the information retrieval (IR) performance of folksonomies from social bookmarking Web sites with that of search engines and subject directories. Thirty-four participants created 103 queries for various information needs. Results from each IR system were collected, and participants judged their relevance. Folksonomy search results overlapped with those from the other systems, and documents found by both search engines and folksonomies were significantly more likely to be judged relevant than those returned by any single IR system type. The search engines in the study had the highest precision and recall, but the folksonomies fared surprisingly well. Del.icio.us was statistically indistinguishable from the directories in many cases. Overall, the directories were more precise than the folksonomies, but the two had similar recall scores. Better query handling may enhance folksonomy IR performance further. The folksonomies studied were promising and may be able to improve Web search performance.
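The measures named above can be illustrated with a minimal sketch. It assumes set-based precision and a relative recall computed against the pool of all documents judged relevant for a query; the abstract does not state the exact recall formula used in the study, so that choice is an assumption. The system names, document IDs, and judgments below are made-up illustrative data, not results from the study.

```python
from typing import Dict, Set, Tuple


def precision(retrieved: Set[str], relevant: Set[str]) -> float:
    """Fraction of retrieved documents that were judged relevant."""
    if not retrieved:
        return 0.0
    return len(retrieved & relevant) / len(retrieved)


def relative_recall(retrieved: Set[str], pooled_relevant: Set[str]) -> float:
    """Fraction of the pooled relevant documents that this system retrieved.

    Assumption: recall is measured against the pool of relevant documents
    found by all systems, since the full relevant set on the Web is unknown.
    """
    if not pooled_relevant:
        return 0.0
    return len(retrieved & pooled_relevant) / len(pooled_relevant)


# Hypothetical results for one query, keyed by IR system type.
results: Dict[str, Set[str]] = {
    "search_engine": {"a", "b", "c", "d"},
    "directory": {"b", "e"},
    "folksonomy": {"b", "c", "f"},
}

# Hypothetical participant judgments: documents marked relevant.
judged_relevant: Set[str] = {"a", "b", "c", "e"}

# (precision, relative recall) per system.
scores: Dict[str, Tuple[float, float]] = {
    name: (precision(docs, judged_relevant), relative_recall(docs, judged_relevant))
    for name, docs in results.items()
}

# Documents returned by both a search engine and a folksonomy.
overlap: Set[str] = results["search_engine"] & results["folksonomy"]
overlap_precision: float = precision(overlap, judged_relevant)

for name, (p, r) in scores.items():
    print(f"{name}: precision={p:.2f}, relative recall={r:.2f}")
print(f"overlap precision={overlap_precision:.2f}")
```

In this toy data the documents found by both the search engine and the folksonomy are all relevant, which mirrors the direction of the overlap finding reported above but says nothing about its size.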
