DART: the distributed agent-based retrieval toolkit

Search engine technology is evolving from keyword-based indexing and classification of web resources toward more sophisticated techniques that take into account the meaning and context of textual information and its usage. In response to a query, commercial search engines present the user with a large number of results, most of them useless or only partially related to the request. The subsequent refinement is left to the user, who downloads and examines as many pages as possible and simply ignores whatever lies beyond the first few result pages. Furthermore, architectures based on centralized indexes allow commercial search engines to control the advertisement of online information, in contrast to P2P architectures, which focus on user requirements by involving the end user in search engine maintenance and operation. To address these needs, new search engines should focus on three key aspects: semantics, geo-referencing, and collaboration/distribution. Semantic analysis increases the relevance of results. Geo-referencing of catalogued resources allows contextualisation based on the user's position. Collaboration distributes storage, processing, and trust across a worldwide network of nodes running on users' computers, eliminating bottlenecks and central points of failure. In this paper, we describe the studies, concepts, and solutions developed in the DART project to introduce these three key features into a novel search engine architecture.
