A Framework of Online Community based Expertise Information Retrieval on Grid

Web-based online communities such as blogs, forums and scientific communities have become important places for people to seek and share expertise. Search engines such as Google, Yahoo! and Live cannot yet handle queries that require deep semantic understanding of the query or the document. For such queries it may be preferable to find and ask someone who has relevant expertise or experience on the topic, and online communities are where people often seek this kind of advice or help. Before the search capabilities of these communities can be analyzed, we need to gather the data that describe them: questions and answers, social support or discussion, comments or advice, content ratings, social relations, and so forth. There is no universal standard data structure for describing user participation in these communities. Moreover, because these communities rarely interoperate, each typically has access only to its own social data and cannot benefit from the data of other communities. Extracting, aggregating and analyzing data from these communities to find experts within a single framework is therefore a challenging task. In this document, we present a Grid-enabled expertise search engine framework (GREFES), which uses online communities as sources of experts on various topics. We propose an open data structure called SNML (Social Network Markup Language) to describe user participation in online communities. The architecture addresses the major challenges in crawling community data and in query processing by using the computational power and high bandwidth inherently available in the Grid. Our framework also provides open APIs so that third-party providers and developers can build new solutions, which in turn bring in more user feedback to improve the system.

GFD-I.164 January 14, 2010
