Efficiency in generalized pipeline networks

The architecture of most of today's super machines revolves around parallel or pipeline processing. Typical examples of such machines are the CDC STAR-100, TI-ASC, PEPE, IBM 360/91 and 360/195, and CDC 6600 and 7600. They all have distinct pipeline processing capabilities, either in the form of internally pipelined arithmetic functional units or in the form of a pipeline of special-purpose functional units. The principal idea behind pipelining is to create as much overlap as possible in the operations of the different facilities, for example, memory fetch units, decoding units, adders, and multipliers. Concurrency of different operations increases the system utilization. As an important consequence of this concurrency, the execution of most jobs is accelerated considerably, as is evidenced in systems like the 360/195 and TI-ASC. Ideally, instead of delivering one output per major cycle, a pipelined machine may deliver one output per minor cycle. A typical linear pipeline is shown in Figure 1(a).
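The ideal output rate mentioned above can be illustrated with a minimal sketch. It rests on assumptions that are not stated in the text: every stage of a linear pipeline takes exactly one minor cycle, a major cycle equals the number of stages in minor cycles, a new task is available every minor cycle, and there are no stalls or conflicts. The function names are hypothetical and chosen only for this illustration.

```python
from typing import List, Optional


def nonpipelined_time(num_tasks: int, num_stages: int) -> int:
    """Minor cycles needed when each task must finish before the next starts.

    Assumption: one major cycle = num_stages minor cycles.
    """
    return num_tasks * num_stages


def pipelined_time(num_tasks: int, num_stages: int) -> int:
    """Minor cycles needed when a new task enters stage 0 every minor cycle."""
    if num_tasks == 0:
        return 0
    # The first task needs num_stages cycles to fill the pipeline;
    # each later task completes one minor cycle after its predecessor.
    return num_stages + num_tasks - 1


def simulate_linear_pipeline(num_tasks: int, num_stages: int) -> List[int]:
    """Return the minor cycle in which each task leaves the last stage."""
    stages: List[Optional[int]] = [None] * num_stages
    completion: List[int] = []
    next_task = 0
    cycle = 0
    while len(completion) < num_tasks:
        cycle += 1
        # Shift every task one stage forward and feed a new task, if any,
        # into the first stage.
        entering = next_task if next_task < num_tasks else None
        stages = [entering] + stages[:-1]
        if entering is not None:
            next_task += 1
        # The task occupying the last stage completes in this minor cycle.
        if stages[-1] is not None:
            completion.append(cycle)
    return completion


if __name__ == "__main__":
    k, n = 4, 1000  # hypothetical stage count and task count
    total = pipelined_time(n, k)
    print("non-pipelined minor cycles:", nonpipelined_time(n, k))
    print("pipelined minor cycles:", total)
    print("outputs per minor cycle:", n / total)
```

Under these assumptions, once the pipeline is full the completion times are consecutive minor cycles, so the output rate approaches one result per minor cycle as the number of tasks grows, whereas the non-overlapped machine delivers one result per major cycle.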