Constraint propagation techniques for theory-driven data interpretation (artificial intelligence, machine learning)

This dissertation defines the task of theory-driven data interpretation (TDDI) and investigates the adequacy of constraint propagation techniques for performing it. Data interpretation is the process of applying a given theory T (possibly a partial theory) to interpret observed facts F and infer a set of plausible initial conditions C such that from C and T one can deduce F. Most existing data interpretation programs do not employ an explicit theory T, but rather use some algorithm that embodies T. A program performs theory-driven data interpretation if it can accept an explicit theory at run time and employ it to perform data interpretation. The method of local propagation of constraints is investigated as a technique for implementing TDDI programs. An empirical study of the adequacy of constraint propagation is described. In the study, a constraint propagation program called PRE was constructed and tested on a set of 22 data interpretation problems. These problems arise as a subtask of the task of forming theories about the UNIX operating system. For UNIX, theories are expressed as computer programs, and TDDI involves the reverse-execution of these programs to determine what inputs could have given rise to the observed outputs. The architecture and implementation details of PRE are described and several key issues affecting the implementation of constraint propagation programs are identified. Constraint propagation techniques are found to be adequate for the UNIX task if (a) the representation for theories is augmented to include certain invariant facts about coordinated data structures and (b) the constraint propagation program is given a simplified algebra for reasoning about the primitive operations in the theory. 
In general, constraint propagation is adequate for TDDI if (a) the primitive predicates of the theory are invertible, (b) a tractable algebra for reformulating constraint loops is available, (c) all noise is removed from the data, and (d) the theory does not contain implicit global invariants.
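
As an illustration of the reverse-execution idea, the following is a minimal sketch of local constraint propagation and not a reconstruction of PRE. It assumes a toy theory y = (x + 3) * 2 built from two invertible primitives; the names Cell, Adder, Multiplier and build_theory are hypothetical and introduced only for this example. Setting the observed output y lets the network infer the input x.

```python
from fractions import Fraction


class Cell:
    """A variable in the network; its value is None until it is known."""

    def __init__(self, name):
        self.name = name
        self.value = None
        self.constraints = []

    def set(self, value):
        if self.value is None:
            self.value = value
            # Local propagation: wake only the constraints touching this cell.
            for constraint in self.constraints:
                constraint.propagate()
        elif self.value != value:
            raise ValueError(
                f"contradiction at {self.name}: {self.value} vs {value}")


class Adder:
    """Constraint a + b = total, usable in any direction."""

    def __init__(self, a, b, total):
        self.a, self.b, self.total = a, b, total
        for cell in (a, b, total):
            cell.constraints.append(self)

    def propagate(self):
        a, b, t = self.a.value, self.b.value, self.total.value
        if a is not None and b is not None:
            self.total.set(a + b)
        elif a is not None and t is not None:
            self.b.set(t - a)
        elif b is not None and t is not None:
            self.a.set(t - b)


class Multiplier:
    """Constraint a * b = product; invertible only for a nonzero known factor."""

    def __init__(self, a, b, product):
        self.a, self.b, self.product = a, b, product
        for cell in (a, b, product):
            cell.constraints.append(self)

    def propagate(self):
        a, b, p = self.a.value, self.b.value, self.product.value
        if a is not None and b is not None:
            self.product.set(a * b)
        elif a is not None and p is not None and a != 0:
            self.b.set(Fraction(p, a))
        elif b is not None and p is not None and b != 0:
            self.a.set(Fraction(p, b))


def build_theory():
    """Build the toy theory y = (x + 3) * 2 and return its cells (x, y)."""
    x, three, s = Cell("x"), Cell("three"), Cell("s")
    two, y = Cell("two"), Cell("y")
    Adder(x, three, s)
    Multiplier(s, two, y)
    three.set(3)
    two.set(2)
    return x, y


# Reverse execution: observe the output F (y = 10), infer the input C (x).
x, y = build_theory()
y.set(10)
print(x.value)
```

In this sketch the same network also runs forward (set x and read y), and an observation inconsistent with values already known raises a contradiction. A theory such as y = x + x would stall here, because the single adder never has two known terminals; that is the kind of constraint loop for which condition (b) requires an algebraic reformulation, for example to y = 2 * x.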