Encoding Classifications into Lightweight Ontologies

Classifications have been used for centuries to catalogue and search large sets of objects. In the early days these were mainly books; more recently they also include Web pages, pictures and any kind of electronic information item. Classifications describe their contents using natural language labels, which have proved very effective in manual classification. However, natural language labels show their limitations when one tries to automate the process, as they make it very hard to reason about classifications and their contents. In this paper we introduce the novel notion of Formal Classification, a graph structure whose labels are written in a propositional concept language. Formal Classifications turn out to be a form of lightweight ontology. This, in turn, allows us to reason about them, to associate with each node a normal form formula which univocally describes its contents, and to reduce document classification to reasoning about subsumption.
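As a rough illustration of the last point, the following minimal sketch reduces document classification to a subsumption check. It is a simplification under stated assumptions, not the paper's construction: labels are restricted to conjunctions of atomic concepts (the propositional concept language is richer), the concept of a node is taken to be the conjunction of the labels on its path from the root, and the names `Node`, `subsumed_by` and `classify`, as well as the example categories, are hypothetical.

```python
class Node:
    """A classification node whose label is a conjunction of atomic concepts."""

    def __init__(self, name, atoms, parent=None):
        self.name = name
        self.label = frozenset(atoms)   # assumption: label = set of conjoined atoms
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def concept(self):
        # Assumed node formula: the conjunction of all labels from the root
        # down to this node.
        inherited = self.parent.concept() if self.parent else frozenset()
        return inherited | self.label


def subsumed_by(specific, general):
    # For pure conjunctions, "specific" is subsumed by "general" exactly when
    # every atom required by "general" also occurs in "specific".
    return general <= specific


def classify(node, doc_concept):
    """Return the names of the deepest nodes whose concept subsumes the document."""
    if not subsumed_by(doc_concept, node.concept()):
        return []
    deeper = [hit for child in node.children for hit in classify(child, doc_concept)]
    return deeper or [node.name]


# Hypothetical toy classification.
root = Node("Top", [])
pictures = Node("Pictures", ["picture"], root)
europe = Node("Europe", ["europe"], pictures)
italy = Node("Italy", ["italy"], europe)
books = Node("Books", ["book"], root)

doc = frozenset({"picture", "europe", "italy", "mountain"})
print(classify(root, doc))
```

With full propositional labels (disjunction, negation) the set-inclusion test above would no longer be sufficient, and the subsumption check would have to be delegated to a propositional or description logic reasoner.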
