Generalized weighted tree similarity algorithms for taxonomy trees

Taxonomy trees are used in machine learning, information retrieval, bioinformatics, and multi-agent systems for matching and matchmaking in e-business, e-marketplaces, and e-learning. An earlier weighted tree similarity algorithm combines the matching and missing values between two taxonomy trees. This paper shows that the algorithm has limitations when the same sub-tree appears at different positions in a pair of trees. We introduce a generalized formula for combining matching and missing values and, based on it, propose two generalized weighted tree similarity algorithms. The first algorithm calculates the matching and missing values between two taxonomy trees separately and combines them globally. The second algorithm calculates the matching and missing values at each level of the two trees and combines them recursively at every level, which preserves the structural information between the two trees. The proposed algorithms efficiently use the missing value in the similarity computation to distinguish among taxonomy trees that have the same matching value but different miss trees at different positions. A set of synthetic weighted binary trees is generated, and computational experiments are carried out that demonstrate the effects of arc weights, matching as well as missing values in a pair
