Unifying manycore and FPGA processing with the RUSH architecture

Because of the constraints of space computing, the set of available processing technologies is limited. Conventionally, designers have had to choose between programmable rad-hard processors and fixed ASIC solutions. FPGAs provide significantly better power-performance efficiency than general-purpose processors, but are more costly to program and are less flexible. For terrestrial applications, manycore processors have been adopted for a class of applications where both performance and flexible programmability are important metrics. Maestro, the first rad-hard manycore processor, has the potential to enable new capabilities for space computation. However, for many applications, certain timing-critical tasks still require the performance efficiency of an FPGA co-processor. Moreover, integrating such heterogeneous systems is challenging because the individual processing substrates have differing internal programming models. As a result, data sharing and dynamic workload scheduling across heterogeneous architectures are often suboptimal and scale poorly. The Rad-hard Unified Scalable Heterogeneous (RUSH) architecture is a heterogeneous processing platform with both a manycore chip and an FPGA. RUSH provides a unified programming model across both chips to allow for rapid development of scalable and efficient implementations. This paper gives an overview of RUSH's technical approach and presents an example application: a WiMAX radio transceiver.
