A Semantic-based Approach to Interoperability of Classification Hierarchies: Evaluation of Linguistic Techniques

Classification Hierarchies (CHs) are widely used to organize documents in a way that makes their retrieval easier. Common examples of CHs are Web directories, marketplace catalogs, and file systems. In this paper we discuss and evaluate CtxMatch, an approach to interoperability that discovers mappings among CHs by considering the semantic interpretation of their nodes. CtxMatch performs linguistic processing of the labels attached to the nodes, including tokenization, part-of-speech tagging, multiword recognition, and word sense disambiguation. We present an evaluation of the overall performance of the approach over Web directories, as well as a systematic analysis of the linguistic modules involved.
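
To make the label-processing steps concrete, the following minimal sketch chains tokenization, multiword recognition, and part-of-speech tagging over a single node label. It is an illustration only and not the CtxMatch implementation: the lexicons `MULTIWORDS` and `POS_LEXICON`, the function names, and the default tag of `NOUN` for unknown tokens are all assumptions made for this example, and word sense disambiguation is left out because it would require a sense inventory that the sketch does not model.

```python
import re

# Hypothetical toy lexicons, for illustration only; they are NOT the
# lexical resources used by CtxMatch.
MULTIWORDS = {("operating", "system"), ("art", "gallery")}
POS_LEXICON = {"and": "CONJ", "in": "PREP", "of": "PREP"}


def tokenize(label):
    """Split a node label into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", label.lower())


def merge_multiwords(tokens, multiwords=MULTIWORDS, max_len=2):
    """Greedily merge the longest known multiword starting at each position."""
    merged = []
    i = 0
    while i < len(tokens):
        for n in range(max_len, 1, -1):
            candidate = tuple(tokens[i:i + n])
            if len(candidate) == n and candidate in multiwords:
                merged.append(" ".join(candidate))
                i += n
                break
        else:
            # No multiword starts here: keep the single token.
            merged.append(tokens[i])
            i += 1
    return merged


def tag(tokens, lexicon=POS_LEXICON, default="NOUN"):
    """Look up a coarse tag per token; unknown tokens get the assumed default."""
    return [(token, lexicon.get(token, default)) for token in tokens]


def process_label(label):
    """Run the three toy steps in sequence on one label."""
    return tag(merge_multiwords(tokenize(label)))


if __name__ == "__main__":
    print(process_label("Art Gallery in Italy"))
```

Merging multiwords before tagging lets a label such as "Art Gallery" be treated as one lexical unit rather than two independent nouns, which is the kind of distinction a later disambiguation step would depend on.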