This thesis introduces new methods for solving the problem of generalizing from relational data. I consider a situation in which we have a set of concepts and a set of relations among them; the data consist of a few instances of these relations holding among the concepts, and the aim is to infer further instances. My approach is to learn from the data a representation of the objects in terms of the features that are relevant to the set of relations at hand, together with rules for how these features interact. I then use these distributed representations to infer how objects relate.
Linear Relational Embedding (LRE) is a new method for learning distributed representations of objects and relations. It finds a mapping from objects into a feature space by imposing the constraint that relations in this space are modelled by linear operations. Having learned such distributed representations, it becomes possible to learn a probabilistic model which can be used to infer both positive and negative instances of the relations.
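The core idea can be sketched in a few lines of numpy. This is a minimal illustration, not the exact training procedure of the thesis: the toy "successor" relation, the dimensionalities, and the plain gradient-descent loop are my assumptions. Each object is a learned vector, each relation a learned matrix R, and the probability that b completes the triple (R, a, ?) is a softmax over negative squared distances from R·x_a to every object vector.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: five objects and one relation ("successor": a -> a + 1).
n_obj, dim = 5, 3
triples = [(0, 1), (1, 2), (2, 3), (3, 4)]     # (a, b) pairs for relation R

X = rng.normal(scale=0.1, size=(n_obj, dim))   # object vectors (learned)
R = rng.normal(scale=0.1, size=(dim, dim))     # relation matrix (learned)

def sq_dists(y):
    """Squared distance from point y to every object vector."""
    return ((X - y) ** 2).sum(axis=1)

lr = 0.1
for _ in range(5000):
    gX, gR = np.zeros_like(X), np.zeros_like(R)
    for a, b in triples:
        y = R @ X[a]
        d = sq_dists(y)
        e = np.exp(-(d - d.min()))             # stable softmax over -d
        p = e / e.sum()                        # P(c | R, a)
        w = (np.arange(n_obj) == b) - p        # dL/dd_c for L = -log p_b
        gy = 2.0 * (w[:, None] * (y - X)).sum(axis=0)
        gR += np.outer(gy, X[a])               # y = R @ x_a
        gX[a] += R.T @ gy                      # x_a as the relation's input
        gX -= 2.0 * w[:, None] * (y - X)       # each x_c as a candidate
    X -= lr * gX
    R -= lr * gR

# Completing (R, a, ?): pick the object nearest to R @ X[a].
preds = [int(np.argmin(sq_dists(R @ X[a]))) for a, _ in triples]
```

Because the softmax is discriminative, training pulls the correct completion toward R·x_a while pushing every other object away, which is what makes nearest-neighbour lookup a sensible inference rule afterwards.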
LRE shows excellent generalization performance. On a classical problem its results are far better than those obtained by any previously published method, and results on other problems confirm this. Moreover, after learning a distributed representation for a set of objects and relations, LRE can easily modify these representations to incorporate new objects and relations. Learning is fast, and LRE rarely converges to solutions that generalize poorly.
Due to its linearity, LRE cannot represent some relations of arity greater than two. I therefore introduce Non-Linear Relational Embedding (NLRE) and show that it can represent relations that LRE cannot. A probabilistic model, which can be used to infer both positive and negative instances of the relations, can also be learned for NLRE. Hierarchical LRE and Hierarchical NLRE are modifications of these methods for learning distributed representations of variable-sized recursive data structures. Results show that these methods are able to extract semantic features from trees and use them to generalize correctly to novel trees.
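Why linearity is a limitation can be seen with a classic example of my own choosing (the XOR pattern is an assumption here, not an example from the thesis): treat a ternary relation r(a, b, c) as a map from the pair of codes [x_a, x_b] to the code x_c. If c = a XOR b under scalar 0/1 codes, no linear map of the pair can fit the targets, while a one-hidden-layer network, standing in for NLRE's non-linear relation, can.

```python
import numpy as np

rng = np.random.default_rng(1)

# Ternary relation with an XOR pattern: r(a, b) should output c = a XOR b.
A = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])  # inputs [x_a, x_b]
t = np.array([0., 1., 1., 0.])                          # target codes x_c

# Best linear fit (with bias): the least-squares solution is the constant
# 0.5, leaving an error of 0.5 on every pattern -- linearity cannot do better.
Ab = np.hstack([A, np.ones((4, 1))])
w, *_ = np.linalg.lstsq(Ab, t, rcond=None)
linear_err = np.abs(Ab @ w - t).max()

# A small tanh network (a stand-in for NLRE's non-linear operation) fits it.
n_hid = 8
W1 = rng.normal(scale=1.0, size=(2, n_hid)); b1 = np.zeros(n_hid)
W2 = rng.normal(scale=1.0, size=n_hid);      b2 = 0.0

lr = 0.1
for _ in range(10000):
    h = np.tanh(A @ W1 + b1)                  # hidden layer
    y = h @ W2 + b2                           # predicted code
    e = y - t                                 # gradient of 0.5 * sum(e**2)
    gW2 = h.T @ e; gb2 = e.sum()
    gh = np.outer(e, W2) * (1 - h ** 2)       # backprop through tanh
    gW1 = A.T @ gh; gb1 = gh.sum(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

nonlinear_err = np.abs(np.tanh(A @ W1 + b1) @ W2 + b2 - t).max()
```

The gap between `linear_err` and `nonlinear_err` is the whole point: a relation whose output is not an additive function of its arguments lies outside what any linear operation can express.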