Logsonomy - social information retrieval with logdata

Social bookmarking systems are an established part of the Web 2.0. In such systems, users describe bookmarks with keywords called tags. The structure behind these social systems, called a folksonomy, can be viewed as a tripartite hypergraph of user, tag and resource nodes. This underlying network shows specific structural properties that explain its growth and the possibility of serendipitous exploration. Today's search engines are the gateway for retrieving information from the World Wide Web. Short queries, typically consisting of two to three words, describe a user's information need. In response to the results displayed by the search engine, users click on the links of the result page that they expect to be relevant. This clickdata can be represented as a folksonomy in which queries are descriptions of clicked URLs. The resulting network structure, which we will term a logsonomy, is very similar to that of folksonomies. To investigate its properties, we analyze the topological characteristics of the tripartite hypergraph of queries, users and bookmarks on a large snapshot of del.icio.us and on query logs of two large search engines. All three datasets show small world properties. The tagging behavior of users, which is explained by preferential attachment of the tags in social bookmarking systems, is reflected in the distribution of single query words in search engines. We conclude that the clicking behavior of search engine users, based on the displayed search results, and the tagging behavior of social bookmarking users are driven by similar dynamics.
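As an illustration of the representation described above, the following minimal sketch turns click records into logsonomy hyperedges. It assumes that each query is split into single words and that every word links the clicking user to the clicked URL; the `Click` record, the function names and the sample data are hypothetical and not taken from the analyzed datasets.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Set, Tuple


class Click(NamedTuple):
    """One click record from a query log (hypothetical layout)."""
    user: str   # anonymized user or session identifier
    query: str  # raw query string as typed by the user
    url: str    # URL of the clicked search result


def build_logsonomy(clicks: Iterable[Click]) -> Set[Tuple[str, str, str]]:
    """Return the set of (user, query word, url) hyperedges.

    Assumption: queries are lowercased and split on whitespace, so each
    query word plays the role of a tag in a folksonomy.
    """
    edges = set()
    for click in clicks:
        for word in click.query.lower().split():
            edges.add((click.user, word, click.url))
    return edges


def word_frequencies(edges: Set[Tuple[str, str, str]]) -> Counter:
    """Count in how many hyperedges each query word occurs."""
    return Counter(word for _, word, _ in edges)


# Made-up sample data for illustration only.
clicks = [
    Click("u1", "python tutorial", "http://a.example/py"),
    Click("u2", "python docs", "http://b.example/docs"),
    Click("u1", "python tutorial", "http://a.example/py"),  # repeated click
]

edges = build_logsonomy(clicks)
freq = word_frequencies(edges)
```

Storing the hyperedges in a set means that repeated identical clicks collapse into one edge, mirroring a folksonomy in which a user assigns a given tag to a resource at most once. Treating the whole query string as a single tag would be an equally plausible mapping; the word-level split is chosen here only because the comparison above concerns single query words.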
