Retrospective: a low-overhead coherence solution for multiprocessors with private cache memories

The cache coherence solution proposed in this paper, now referred to as the “Illinois Protocol”, had its origin in my work on performance modeling and analysis of multiprocessors. Having completed the work on the performance of multiprocessor interconnection networks in 1978, first published at ISCA-1979 [1], I embarked upon extending the analysis to multiprocessors with cache memories. The analytical method used in [1] was readily extendible to more complex situations involving transactions between processors, caches, interconnection networks and memories. The interconnection network used was either a crossbar or a multi-stage Delta Network [1]. I completed this analytical work in the Fall of 1980, and it was later published in the Transactions on Computers [2]. The focus of these two papers was interference in the interconnection network, with or without caches, and therefore cache coherence did not get much attention; neither did bus-based systems. However, things changed in 1981 when I came across a research project that Ed Davidson and his students were conducting at Illinois. Davidson had built a multiprocessor, called AMP-1, using the Motorola 6800 microprocessor and a synchronous bus. This system was designed in the 1977-78 time frame by Bob Horst and Roy Kravitz and is described in an ISCA-1980 paper [3]. Several others were involved in performance modeling and measurement of the AMP-1, notably Joel Emer and David Yen. This is when I thought I could use my multiprocessor analytical method of [2] for single-bus multiprocessor systems. While the AMP-1 did not have private caches, it still raised my interest in analyzing a bus-based system with private caches. In 1982, I was familiar with the then-prevalent microprocessor buses, namely Intel Multibus and Motorola VME bus. I thought I should model a hypothetical multiprocessor system with caches and a bus like VME or Multibus.
I was teaching a hardware lab that designed an interface for Multibus, and as a result I was very cognizant of the lowest-level details of its bus protocol. To model such a bus I needed to know the various events that caused bus activity. A survey of the literature found no bus-based system with private caches. Most cache papers described directory-based protocols, and in addition the interconnection was either a crossbar or not mentioned. So I decided to just assume some arbitrary cache protocol. After all, my goal was to provide a model and analytical method for such a system, not to invent new cache protocols. As happens with many innovations, this one was unplanned! So in Fall 1982, to make my analysis more realistic, I decided to define a cache protocol that was practical and low in cost in relation to bus interfaces for Multibus. It was not very difficult to come up with a reasonable cache protocol. That same Fall of 1982, Marc Papamarcos started his M.S. thesis under my supervision. He was very familiar with the VME bus and Motorola microprocessors. So I asked him to work on a hardware implementation of this cache protocol for the Motorola VME bus. It so happens that the protocol is not directly implementable on either Multibus or VME bus without extending the capabilities of the bus in some way. However, we made sure that any modifications to the bus were simple, practical and of low cost. Marc carried out a very detailed hardware design for the cache controller [4]. Marc also worked out details for implementing an indivisible read-modify-write for the Motorola 68K.