Leveraging personal metadata for Desktop search: The Beagle++ system

Search on PCs has become less efficient than searching the Web due to the increasing amount of stored data. In this paper, we present an innovative Desktop search solution that relies on extracted metadata, context information, and additional background information to improve Desktop search results. We also present a practical application of this approach: the extensible Beagle++ toolbox. To validate our approach, we conducted a series of experiments. By comparing our results against those of a regular Desktop search solution, Beagle, we show improved search quality and overall performance.
