Formalizing the figural: aspects of a foundation for document manipulation

These are exciting times in the development of computational tools for document manipulation. The wealth of developments in text and graphics editors, page description languages, window systems, and document interchange languages, along with regular discussion in the daily newspapers, Sunday supplements, and PC magazines, is a sure sign that a technology once confined to academic and industrial research laboratories has decisively entered the public realm. But if one steps back from the details of individual systems, features, competing claims, and buzzwords, one notices that very little of this work, if any, is based on a sound theoretical understanding of the subject matter. That this is so can be seen by analogy with compiler construction, which is today guided by an impressive body of theory: formal language and finite automata theory, programming language semantics, and so on. Textbooks are written and courses taught presenting the whys and wherefores of compiler construction; significantly, this theory exists independently of the implementation details of particular compilers. But this was not always the case. There was a pretheoretical stage in the development of compilers when they were simply implemented by the seat of the pants, without a clear understanding of the foundation that was needed, or perhaps even an awareness that a theoretical foundation was desirable or achievable. We believe that document manipulation, too, is in its pretheoretical stage. Where are the courses on editor construction, for example? If such courses were offered today, what would they teach, other than the implementation details of particular systems? More generally, where are the theories that might guide the implementation of future systems, and what would such theories look like?