Programming the Intel 80-core network-on-a-chip Terascale Processor

Intel's 80-core terascale processor was the first generally programmable microprocessor to break the Teraflops barrier. The primary goal for the chip was to study power management and on-die communication technologies. When announced in 2007, it received a great deal of attention for running a stencil kernel at 1.0 single precision TFLOPS while using only 97 Watts. The literature about the chip, however, focused on the hardware, saying little about the software environment or the kernels used to evaluate the chip. This paper completes the literature on the 80-core terascale processor by fully defining the chip's software environment. We describe the instruction set, the programming environment, the kernels written for the chip, and our experiences programming this microprocessor. We close by discussing the lessons learned from this project and what it implies for future message passing, network-on-a-chip processors.