Analyzing Methods of the Relation between Concepts based on a Concept Hierarchy

Data objects are usually organized hierarchically, and the relations between them are analyzed using the corresponding concept hierarchy. A relation between data objects, for example how similar they are, is usually measured by their conceptual distance in the hierarchy. If one node is an ancestor of another, calculating the vertical distance is enough to determine how close they are. However, if no such ancestor relation holds between two nodes, the vertical distance cannot express their relation explicitly. This paper tries to fill this gap by improving the hierarchy-based analysis method for data objects. The contributions of this paper are: (1) an improved method for evaluating the vertical distance between concepts; (2) a definition of the horizontal distance between concepts and a method for calculating it; and (3) a discussion of methods that confine a range by the horizontal and vertical distances and evaluate the relation between concepts.

Keywords: Concept Hierarchy, Horizontal Distance, Relation Analysis, Vertical Distance
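To make the gap described in the abstract concrete, the following is a minimal sketch, not the paper's method. It assumes a toy hierarchy stored as a child-to-parent map (the names `PARENT` and `vertical_distance` are hypothetical) and takes the vertical distance to be the number of edges between a node and its ancestor. For two nodes with no ancestor relation, such as siblings, this baseline notion gives no answer, which is the case the horizontal distance is meant to address.

```python
from typing import Dict, Optional

# Hypothetical toy hierarchy: each concept maps to its parent (the root maps to None).
PARENT: Dict[str, Optional[str]] = {
    "Thing": None,
    "Animal": "Thing",
    "Plant": "Thing",
    "Dog": "Animal",
    "Cat": "Animal",
}


def vertical_distance(a: str, b: str) -> Optional[int]:
    """Return the number of edges between a and b when one is an ancestor
    of the other (0 if they are the same node); return None otherwise."""
    for upper, lower in ((a, b), (b, a)):
        steps = 0
        node: Optional[str] = lower
        # Walk from the lower node toward the root, looking for the upper node.
        while node is not None:
            if node == upper:
                return steps
            node = PARENT[node]
            steps += 1
    return None  # no ancestor relation, e.g. siblings


if __name__ == "__main__":
    print(vertical_distance("Thing", "Dog"))  # ancestor and descendant
    print(vertical_distance("Dog", "Cat"))    # siblings: no vertical relation
```

In this sketch, "Dog" and "Cat" share the parent "Animal" but neither is an ancestor of the other, so a purely vertical measure cannot relate them directly.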
