Compiler optimization of embedded applications for an adaptive SoC architecture

Adaptive Explicitly Parallel Instruction Computing (AEPIC) is a stylized form of a reconfigurable system-on-a-chip designed to enable compiler control of reconfigurable resources. In this paper, and for the first time, we validate the viability of automating two key optimizations proposed in the AEPIC compilation framework: configuration allocation and configuration scheduling. The AEPIC architecture comprises an Explicitly Parallel Instruction Computing (EPIC) core coupled with an adaptive fabric and architectural features that support dynamic management of the fabric. We show that this approach to compiler-centric hardware customization, originally proposed by Palem, Talla, Devaney and Wong ([26],[27]), yields speedups ranging from 150% to over 600% for embedded applications when compared with general-purpose and digital signal processor solutions. We also provide a normalized cost analysis of our performance gains, where the normalization is based on the silicon area required. In addition, we analyze the AEPIC architectural space, identifying the "sweet spot" of performance on the AEPIC architecture by examining performance across benchmarks and computational resource configurations. Finally, we report a preliminary result on how our compiler-based approach affects productivity metrics in the development of hardware/software partitioned custom solutions. Our implementation and validation platform is based on the well-known TRIMARAN optimizing compiler infrastructure [13].

[1]  Krishna V. Palem,et al.  Scheduling time-critical instructions on RISC machines , 1989, TOPL.

[2]  Scott A. Mahlke,et al.  Processor Acceleration Through Automated Instruction Set Customization , 2003, MICRO.

[3]  Scott A. Mahlke,et al.  FLASH: foresighted latency-aware scheduling heuristic for processors with customized datapaths , 2004, International Symposium on Code Generation and Optimization, 2004. CGO 2004.

[4]  Gregory J. Chaitin,et al.  Register allocation and spilling via graph coloring , 2004, SIGP.

[5]  Scott A. Mahlke,et al.  IMPACT: An Architectural Framework for Multiple-Instruction-Issue Processors , 1998, 25 Years ISCA: Retrospectives and Reprints.

[6]  Sun Microsystems.  Jini Architecture Specification Version 2.0 , 2003 .

[7]  Tulika Mitra,et al.  Characterizing embedded applications for instruction-set extensible processors , 2004, Proceedings. 41st Design Automation Conference, 2004.

[8]  Krishna V. Palem,et al.  Compiler Optimizations for Adaptive EPIC Processors , 2001, EMSOFT.

[9]  B. R. Rau,et al.  HPL-PD Architecture Specification: Version 1.1 , 2000 .

[10]  David F. Bacon,et al.  Compiler transformations for high-performance computing , 1994, CSUR.

[11]  Pedro C. Diniz,et al.  Using estimates from behavioral synthesis tools in compiler-directed design space exploration , 2003, Proceedings 2003. Design Automation Conference (IEEE Cat. No.03CH37451).

[12]  Steven S. Muchnick,et al.  Efficient instruction scheduling for a pipelined architecture , 1986, SIGPLAN '86.

[13]  Pedro C. Diniz,et al.  A compiler approach to fast hardware design space exploration in FPGA-based systems , 2002, PLDI '02.

[14]  Ellis Horowitz,et al.  Software Cost Estimation with COCOMO II , 2000 .

[15]  Krishna V. Palem,et al.  Adaptive explicitly parallel instruction computing , 2001 .

[16]  Monica S. Lam,et al.  Maximizing Multiprocessor Performance with the SUIF Compiler , 1996, Digit. Tech. J..

[17]  B. Ramakrishna Rau,et al.  EPIC: Explicitly Parallel Instruction Computing , 2000, Computer.

[18]  Lori L. Pollock,et al.  A Region-based Partial Inlining Algorithm for an ILP Optimizing Compiler , 2002, PDPTA.