Automated Metadata Hierarchy Derivation

This paper presents an automated approach for building a metadata hierarchy for a set of Web sites without using any predefined external hierarchy, and for merging and comparing the resulting hierarchies. The nodes of a hierarchy are the keywords of the specified Web sites, and the links between these keywords are weak subsumption relationships. We apply this method in the RTGI project (Ghitalla et al., 2004) to previously defined clusters of Web sites. The hierarchies show how homogeneous each cluster is and make it possible to outline the contents of each cluster effectively. Moreover, we construct the common hierarchy of multiple clusters in order to check whether their individual hierarchies remain distinct and well separated within the common one, which in turn indicates whether the clustering is correct. Finally, we build the semantic-hypertextual graph of the sites, which describes their semantic contents together with their topological structure.
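The abstract does not define weak subsumption, so the following is only a minimal sketch of one common co-occurrence formulation: keyword x is taken to subsume keyword y when x appears in most of the sites that contain y, but y does not appear in every site that contains x. The function name `subsumption_edges`, the threshold of 0.8, and the example sites are assumptions made for illustration, not the authors' method or data.

```python
from itertools import permutations


def subsumption_edges(site_keywords, threshold=0.8):
    """Return (parent, child) keyword pairs under an assumed subsumption test.

    site_keywords maps a site name to the set of keywords found on it.
    The threshold value is an illustrative assumption.
    """
    # Map each keyword to the set of sites that contain it.
    sites_by_kw = {}
    for site, keywords in site_keywords.items():
        for kw in keywords:
            sites_by_kw.setdefault(kw, set()).add(site)

    edges = []
    for x, y in permutations(sorted(sites_by_kw), 2):
        shared = len(sites_by_kw[x] & sites_by_kw[y])
        p_x_given_y = shared / len(sites_by_kw[y])  # how often x accompanies y
        p_y_given_x = shared / len(sites_by_kw[x])  # how often y accompanies x
        # x subsumes y: x covers (almost) all of y, but y does not cover all of x.
        if p_x_given_y >= threshold and p_y_given_x < 1.0:
            edges.append((x, y))
    return edges


# Hypothetical cluster of three sites and their keywords.
sites = {
    "a.example": {"music", "jazz"},
    "b.example": {"music", "rock"},
    "c.example": {"music", "jazz", "blues"},
}
edges = subsumption_edges(sites)
for parent, child in edges:
    print(f"{parent} -> {child}")
```

In this toy cluster, "music" ends up above every other keyword. The sketch also keeps transitive links such as "music" to "blues", which a real hierarchy builder would probably prune.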

[1]  Steffen Staab, et al.  Learning Concept Hierarchies from Text Corpora using Formal Concept Analysis, 2005, J. Artif. Intell. Res.

[2]  F. Ghitalla, et al.  TARENTe: an experimental tool for extracting and exploring Web aggregates, 2004, Proceedings. 2004 International Conference on Information and Communication Technologies: From Theory to Applications, 2004.

[3]  Thierry Hamon, et al.  A Step towards the Detection of Semantic Variants of Terms in Technical Documents, 1998, COLING-ACL.

[4]  Sophia Ananiadou, et al.  Automatic Discovery of Term Similarities Using Pattern Mining, 2002, COLING-02 on COMPUTERM 2002 second international workshop on computational terminology.

[5]  Marti A. Hearst, et al.  Nearly-Automated Metadata Hierarchy Creation, 2004, NAACL.

[6]  Sara Rydin, et al.  Building a hyponymy lexicon with hierarchical structure, 2002, ACL 2002.

[7]  Sharon A. Caraballo.  Automatic construction of a hypernym-labeled noun hierarchy from text, 1999, ACL.

[8]  W. Bruce Croft, et al.  Deriving concept hierarchies from text, 1999, SIGIR '99.

[9]  Dekang Lin, et al.  Using Syntactic Dependency as Local Context to Resolve Word Sense Ambiguity, 1997, ACL.

[10]  W. Bruce Croft, et al.  Discovering and Comparing Topic Hierarchies, 2000, RIAO.