Estimating Conceptual Similarities Using Distributed Representations and Extended Backpropagation

The ability to perceive similarities and group entities into meaningful hierarchies is central to the processes of learning and generalisation. In artificial intelligence and data mining, the similarity of symbolic data has been estimated by techniques ranging from feature-matching and correlation analysis to Latent Semantic Analysis (LSA). Techniques based upon cognitive models of similarity and concept formation, by contrast, have received very little attention. In this paper, we propose an extension to a neural network-based approach called Forming Global Representations with Extended backPropagation (FGREP), and show that it can be used to form meaningful conceptual clusters from information about an entity’s perceivable attributes or its usage and interactions. By examining these clusters and their classification errors, we also show that the groupings identified by FGREP are more intuitive, and generalise better, than those formed using LSA.