A Family of Data-Parallel Derivations

A good programmer tries to minimize architecture-specific detail when writing a sequential implementation of an algorithm. This practice makes it possible to port the implementation to other hardware architectures with little additional programmer effort. Indeed, high-level programming languages attempt to hide the detail required by particular machine architectures, making programs easier both to construct and to port. When using traditional methods to implement an algorithm for an advanced parallel architecture, the programmer faces the dilemma of