Access to Relational Knowledge: A Comparison of Two Models

William H. Wilson (billw@cse.unsw.edu.au)
Nadine Marcus (nadinem@cse.unsw.edu.au)
School of Computer Science and Engineering, University of New South Wales, Sydney, New South Wales, 2052, Australia

Graeme S. Halford (gsh@psy.uq.edu.au)
School of Psychology, University of Queensland, Brisbane, Queensland, 4072, Australia

Abstract

If a person knows that Fred ate a pizza, then they can answer the following questions: Who ate a pizza?, What did Fred eat?, What did Fred do to the pizza?, and even Who ate what? We term this and related properties accessibility properties of the relational fact that Fred ate a pizza. Accessibility in this sense is a significant property of human cognitive performance. Among neural network models, those employing tensor product networks have this accessibility property. While feedforward networks trained by error backpropagation have been widely studied, we have found no attempt to use them to model accessibility. This paper discusses an architecture for a backprop net that promises to provide some degree of accessibility. However, while limited forms of accessibility are achievable, the nature of the representation and the nature of backprop learning both entail limitations that prevent full accessibility. Studies of the degradation of accessibility with different sets of training data lead us to a rough metric for the learning complexity of such data sets.

Introduction

The purpose of this research is to determine whether a backpropagation net can be developed that processes propositions with the flexibility that is characteristic of certain classes of symbolic neural net models. This has arguably been difficult for backpropagation nets in the past. For example, the model of Rumelhart and Todd (1993) represents propositions such as "canary can fly". Given the input canary, can, it produces the output fly. However, processing is restricted, so it cannot answer the question what can fly? (canary).

There are, however, at least two types of symbolic nets that readily meet this requirement. One type of net model makes roles and fillers oscillate in synchrony (Hummel & Holyoak, 1997; Shastri & Ajjanagadde, 1993), while another is based on operations such as circular convolution (Plate, 2000) or tensor products (Halford et al., 1994, 1998; Smolensky, 1990). These models appear to have greater flexibility than models based on backpropagation nets, in that they can be queried for any component of a proposition. We will refer to this property of tensor product nets as omni-directional access (cf. Halford, Wilson & Phillips, 1998). Omni-directional access is the ideal form of accessibility.

Another reason for investigating this lies in the work of Halford, Wilson, and Phillips (e.g. 1998), which seeks in part to define a hierarchy of cognitive processes or systems and to draw parallels between this hierarchy and a second hierarchy of types of artificial neural networks. Levels 0 and 1 of this second hierarchy are 2- and 3-layer feedforward nets, and levels 2-5 are tensor product nets of increasing rank. It thus becomes interesting to consider how well feedforward nets can emulate tensor product networks.

[Figure 1 – Tensor product network of rank 3: a 3-dimensional array of binding units, with relation, subject, and object input and output vectors.]
As tensor product networks are not as well known as feedforward networks, we shall describe them and their accessibility properties briefly here before proceeding. Tensor product networks are described in more detail, and from our point of view, in Halford et al. (1994). Briefly, a rank k tensor product network consists of a k-dimensional array of binding units, together with k input/output vectors. For example, a rank 2 tensor product network is a matrix, plus 2 input/output vectors. To teach the network to remember a fact (that is, a k-tuple), the input/output vectors are set to be vectors representing the components of the k-tuple, and a computation is performed that alters the k-dimensional array. Subsequently that fact can be accessed in a variety of ways. It is common to interpret the first input/output vector as representing the relation, and the remaining vectors as its arguments (the subject and object, in Figure 1).
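To make these operations concrete, the following is a minimal sketch of a rank 3 tensor product network in Python/NumPy. It is an illustration under stated assumptions, not the implementation studied in this paper: the component vectors are one-hot (and hence orthonormal), and the vocabularies and the stored fact (Fred ate a pizza) are our own illustrative choices. Storing a fact adds the outer product of its three component vectors to the array of binding units; each query contracts the array with whichever components are known.

import numpy as np

# Illustrative vocabularies; real models could use larger, possibly
# distributed, vector sets.
RELATIONS = ["ate", "likes"]
SUBJECTS = ["Fred", "Mary"]
OBJECTS = ["pizza", "salad"]

def one_hot(item, vocab):
    # Orthonormal representation: one unit per vocabulary item.
    v = np.zeros(len(vocab))
    v[vocab.index(item)] = 1.0
    return v

# The 3-dimensional array of binding units (all weights start at zero).
T = np.zeros((len(RELATIONS), len(SUBJECTS), len(OBJECTS)))

def store(relation, subject, obj):
    # Teach one fact: superimpose the outer product of its components.
    global T
    T += np.einsum("i,j,k->ijk",
                   one_hot(relation, RELATIONS),
                   one_hot(subject, SUBJECTS),
                   one_hot(obj, OBJECTS))

store("ate", "Fred", "pizza")

# "What did Fred eat?" -- contract with the relation and subject vectors;
# the result is an activation vector over objects.
what = np.einsum("ijk,i,j->k", T,
                 one_hot("ate", RELATIONS), one_hot("Fred", SUBJECTS))
print(OBJECTS[np.argmax(what)])            # pizza

# "Who ate a pizza?" -- contract with the relation and object vectors.
who = np.einsum("ijk,i,k->j", T,
                one_hot("ate", RELATIONS), one_hot("pizza", OBJECTS))
print(SUBJECTS[np.argmax(who)])            # Fred

# "Who ate what?" -- contract with the relation vector alone; the result
# is a matrix whose non-zero entries are the stored (subject, object) pairs.
pairs = np.einsum("ijk,i->jk", T, one_hot("ate", RELATIONS))
for j, k in zip(*np.nonzero(pairs)):
    print(SUBJECTS[j], "ate", OBJECTS[k])  # Fred ate pizza

Because the component vectors are orthonormal, retrieval here is exact, and further facts can be superimposed in the same array by further calls to store. Being able to leave any subset of components unspecified in a query is precisely the omni-directional access property discussed above.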