Identifying Inter-Domain Similarities Through Content-Based Analysis of Hierarchical Web-Directories

Providing accurate personalized information services requires knowing users' interests and needs, as represented by their User Models (UMs). Since the quality of personalization depends on the richness of the UMs, services would benefit from enriching their UMs by importing and aggregating partial UMs built by other services from relatively similar domains. The obvious question is how to determine the similarity of domains. This paper proposes computing inter-domain similarities by applying well-known Information Retrieval techniques to compare the textual contents of the Web-sites classified under the domain nodes of Web-directories. Initial experiments validate the feasibility of the proposed approach and raise open research questions.
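
As an illustration of the general idea, the following minimal sketch compares domains by the text of the sites classified under them. The abstract does not name a specific technique, so the TF-IDF weighting and cosine similarity used here are an assumed choice of "well-known Information Retrieval techniques", not necessarily the method evaluated in the paper. The domain names, the sample texts, and the helper names (`tokenize`, `tfidf_vectors`, `cosine`) are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive tokenizer (assumption): lowercase, whitespace split, letters only.
    return [tok for tok in text.lower().split() if tok.isalpha()]


def tfidf_vectors(domain_texts):
    """Map each domain to a sparse TF-IDF vector (dict of term -> weight).

    domain_texts: dict of domain name -> concatenated text of the
    Web-sites classified under that domain node (hypothetical input).
    """
    counts = {d: Counter(tokenize(t)) for d, t in domain_texts.items()}
    n_domains = len(counts)
    doc_freq = Counter()
    for term_counts in counts.values():
        doc_freq.update(term_counts.keys())
    # Smoothed inverse document frequency, so no weight is ever zero.
    idf = {t: math.log((1 + n_domains) / (1 + k)) + 1.0
           for t, k in doc_freq.items()}
    return {d: {t: tf * idf[t] for t, tf in term_counts.items()}
            for d, term_counts in counts.items()}


def cosine(u, v):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Invented toy data standing in for site contents under three domain nodes.
domains = {
    "movies": "film actor director cinema drama comedy film review",
    "television": "series actor director drama comedy episode channel",
    "gardening": "soil plant seed flower garden compost plant",
}
vectors = tfidf_vectors(domains)
sim_movies_tv = cosine(vectors["movies"], vectors["television"])
sim_movies_garden = cosine(vectors["movies"], vectors["gardening"])
```

In this toy setting the two entertainment domains share vocabulary and therefore score higher than the unrelated pair, which shares no terms. A service could use such scores to decide from which domains partial UMs are worth importing, although the threshold for "relatively similar" remains an open choice.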