An Ontology Based Approach to Measuring the Semantic Similarity between Information Objects in Personal Information Collections

This paper introduces a semantic approach to personal information management that combines natural language processing, ontologies and a vector space model to measure the semantic similarity between information objects in personal information collections. The approach proceeds through natural language processing, named entity recognition, and information object integration. First, natural language processing is used to detect meaningful and semantically distinguishable information objects within collections of personal information. Next, named entities are extracted from these information objects, and the features of those entities (such as weight and category) are used to measure the semantic similarity between the objects. Further research will use this similarity measure to index and retrieve information objects in a semantics-based system for personal information management.
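
The abstract does not give the similarity formula, so the following is only a minimal sketch of how a vector space comparison over named entities might look, not the authors' method. It assumes each information object has already been reduced to a list of (name, category, weight) triples, and it swaps in plain cosine similarity as the measure. The function names `entity_vector` and `cosine_similarity`, the category labels, and the example weights are all hypothetical.

```python
import math
from collections import defaultdict


def entity_vector(entities):
    """Build a sparse vector from (name, category, weight) triples.

    Assumption: entities are keyed by (category, lower-cased name), so the
    same surface string in two different categories counts as two dimensions.
    """
    vec = defaultdict(float)
    for name, category, weight in entities:
        vec[(category, name.lower())] += weight
    return dict(vec)


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b[key] for key, w in a.items() if key in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical information objects: an email and a report that share one person.
email = entity_vector([
    ("Alice Smith", "Person", 1.0),
    ("Acme Corp", "Organization", 0.5),
])
report = entity_vector([
    ("Alice Smith", "Person", 1.0),
    ("Paris", "Location", 0.5),
])

similarity = cosine_similarity(email, report)
print(f"similarity(email, report) = {similarity:.2f}")
```

Keying each dimension by category as well as name is one possible way to let the category feature influence the score; an ontology-aware variant could instead give partial credit to entities whose categories are related.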
