Individuals, relations and structures in probabilistic models

Relational data is equivalent to non-relational structured data. It is this equivalence which permits probabilistic models of relational data. Learning of probabilistic models for relational data is possible because one item of structured data is generally equivalent to many related data items. Succession and inclusion are two relations that have been well explored in the statistical literature. A description of the relevant statistical approaches is given. The representation of relational data via Bayesian nets is examined and compared with PRMs. The paper ends with some cursory remarks on structured objects.

1 Learning from iid samples

Recall, from [Cussens, 2000], the well-known correspondence between the mathematical abstractions used in statistics and the real world. This correspondence is given diagrammatically in Figure 1. This view sees Nature as a machine which probabilistically spits out data in response to questions (inputs) that we give it. In some cases (e.g. clustering, density estimation) the independent variables do not play an important role: the machine does not require any input to produce an output. This probabilistic machine has many names in the literature: it is Hacking's "chance set-up" [Hacking, 1965] and Popper's "generating conditions" [Popper, 1983]. This probabilistic machine is often taken to produce output by selecting its output from some population of possible outputs. Such a reconceptualisation is sometimes strained: "But only excessive metaphor makes outcomes of every chance set-up into samples from an hypothetical population" [Hacking, 1965, p. 25]. But it is pretty much hard-coded into the standard Kolmogorovian formalisation of probability. Kolmogorov's axiomatisation defines a probabilistic model to be a probability space (Ω, F, P). Here Ω is the population, and outputs (actually subsets of Ω in F) are chosen according to P.
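To make the triple (Ω, F, P) concrete, the following sketch builds a finite probability space in Python. Everything here is illustrative: the three-element outcome set, the choice of the full power set as F, and the point masses are assumptions for the example, not taken from the paper.

```python
from itertools import combinations

# Omega: a finite population of possible outcomes (illustrative values).
Omega = frozenset({"a", "b", "c"})

def powerset(s):
    """For finite Omega the sigma-algebra F can be taken to be all subsets."""
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

F = powerset(Omega)

# Assumed point masses defining the measure P.
point = {"a": 0.5, "b": 0.3, "c": 0.2}

def P(event):
    """P assigns to each event in F the sum of its point masses."""
    return sum(point[w] for w in event)

# Kolmogorov's axioms hold for this space: P is non-negative,
# P(Omega) = 1, and P is additive over disjoint events.
assert all(P(e) >= 0 for e in F)
assert abs(P(Omega) - 1.0) < 1e-12
assert abs(P(frozenset({"a", "b"})) - (P(frozenset({"a"})) + P(frozenset({"b"})))) < 1e-12
```

On this reading, an "output" of the probabilistic machine is an event in F, chosen according to P, exactly as in the text above.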
In standard approaches to statistical inference (or 'learning'; the terms will be used interchangeably in this paper) we assume that the observed data is composed of independent and identically distributed (iid) items sampled from Ω. The homogeneity of such data permits estimation of P.

[Figure 1: the correspondence between the real world (Machine, Nature) and the mathematical abstraction (Model, Unknown)]