A predicative theory of machine languages and its application to compiler correctness

If computers are to behave correctly, it is important that the software for them be specified, developed, compiled, and executed correctly. At each step both informal and formal methods of software engineering can be applied to enhance reliability. This thesis looks at how formal methods can be applied to the compiling step. The problem is divided into three parts: the modeling of machine languages, the specification of code generators, and the derivation of code generators.
All of this is done in the framework of predicative semantics. In predicative semantics, specifications of computer behaviour--including programs--are considered to be predicates dividing acceptable from unacceptable behaviours. Thus a specification P refines a specification Q if, when both are regarded as predicates, P implies Q over all behaviours.
Machine languages are modeled using predicative semantics. We take the behaviours to be pairs representing the initial and final states of the machine (CPU and memory). While machines vary in the particulars of their state sets and instruction sets, there is much that can be said about machine languages in general. Thus we can develop a body of theory that does not depend on the details of the machine language.
If source programs can be interpreted as predicates on pairs representing initial and final source-level states, then code generation can be specified as follows: given a representation relation that relates source and machine states, to each source-level specification--including programs--there corresponds a machine-level specification. The task of a code generator is to output a machine program that refines the specification corresponding to its input source program.
Using this specification of code generators, the next step is to derive--in a provably correct fashion--a code generator from the semantics of a source language. The thesis develops and demonstrates techniques for doing this.
Predicate logic is used throughout as a medium for expressing the semantics of machine languages, the semantics of source languages, and representation relations. It is also the tool used for reasoning about them. Wherever possible the theory is developed to be independent of particular machine languages, source languages, and representation relations.
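
As an informal illustration of the ideas above, the following Python sketch models specifications as predicates on (initial, final) state pairs and checks refinement by exhaustive enumeration. Everything in it is an assumption made for illustration and is not taken from the thesis: the toy state spaces, the names `refines`, `represents`, and `lift`, and in particular the way `lift` turns a source-level specification into a machine-level one, which is only one plausible reading of "there corresponds a machine-level specification".

```python
from itertools import product

# Hypothetical toy state spaces (not from the thesis).
SOURCE_STATES = [0, 1, 2, 3]                        # value of one source variable x
MACHINE_STATES = list(product(range(4), range(2)))  # (register, flag) pairs


def refines(p, q, states):
    """P refines Q when P implies Q over every (initial, final) behaviour."""
    return all(q(i, f) for i, f in product(states, repeat=2) if p(i, f))


def represents(s, m):
    """Assumed representation relation: the register holds x; the flag is free."""
    return m[0] == s


def lift(source_spec):
    """Assumed machine-level counterpart of a source-level specification.

    For every source state represented by the initial machine state, the final
    machine state must represent some source state allowed by source_spec.
    """
    def machine_spec(m0, m1):
        return all(
            any(represents(s1, m1) and source_spec(s0, s1) for s1 in SOURCE_STATES)
            for s0 in SOURCE_STATES
            if represents(s0, m0)
        )
    return machine_spec


def source_increase(s0, s1):
    """Source specification: x grows, or stays put if already at the top value."""
    return s1 > s0 or s0 == 3


def source_increment(s0, s1):
    """Source program: x := min(x + 1, 3)."""
    return s1 == min(s0 + 1, 3)


def machine_increment(m0, m1):
    """Candidate machine program: saturating increment of the register, clear flag."""
    return m1 == (min(m0[0] + 1, 3), 0)


def machine_noop(m0, m1):
    """A wrong candidate: leaves the machine state unchanged."""
    return m1 == m0


if __name__ == "__main__":
    print(refines(source_increment, source_increase, SOURCE_STATES))
    print(refines(machine_increment, lift(source_increase), MACHINE_STATES))
    print(refines(machine_noop, lift(source_increase), MACHINE_STATES))
```

In this toy setting a code generator would be acceptable for the input `source_increase` if it emitted `machine_increment`, and unacceptable if it emitted `machine_noop`. Exhaustive checking only works because the state spaces are tiny; the thesis reasons with predicate logic instead.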