Dark Wires and the Opportunities for Reconfigurable Logic

Power has become a fundamental limit to silicon performance. Most research has focused on constraining power by reducing transistor switching (dark silicon). Specialized accelerators have been proposed because they implement functionality with fewer transistor switches than general-purpose cores. Increasing efficiency requirements lead to more specialization and, therefore, more accelerators, which potentially increases the distance data must travel to reach them. Communication, however, also consumes energy and therefore needs to be minimized as well (dark wires). This paper examines the balance between compute and communication specialization by comparing hard logic (e.g., ASIC), which is highly efficient but static, with soft logic (e.g., FPGA), which is less efficient but allows computation to be moved to reduce communication distances. Our experimental results show that soft accelerators consume 0.6x-2.1x the total power of hard accelerators when communication costs are taken into account.
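
The trade-off described above can be illustrated with a minimal first-order sketch. It is not the paper's model: it assumes that wire energy grows linearly with distance and bits moved, and that soft logic costs a constant multiple of hard logic per operation. All function and parameter names are hypothetical, and the values in the usage note are placeholders rather than measured data. For a fixed workload and runtime, the energy ratio equals the power ratio.

```python
def total_energy(ops, energy_per_op, bits_moved, distance_mm, wire_energy_per_bit_mm):
    """Compute energy plus communication energy (assumed linear in distance)."""
    compute = ops * energy_per_op
    communication = bits_moved * distance_mm * wire_energy_per_bit_mm
    return compute + communication


def soft_to_hard_ratio(ops, bits_moved, hard_energy_per_op, soft_overhead,
                       hard_distance_mm, soft_distance_mm, wire_energy_per_bit_mm):
    """Ratio of soft-accelerator energy to hard-accelerator energy.

    soft_overhead is the assumed per-operation energy penalty of soft logic
    relative to hard logic (a hypothetical constant factor).
    """
    hard = total_energy(ops, hard_energy_per_op, bits_moved,
                        hard_distance_mm, wire_energy_per_bit_mm)
    soft = total_energy(ops, hard_energy_per_op * soft_overhead, bits_moved,
                        soft_distance_mm, wire_energy_per_bit_mm)
    return soft / hard
```

With placeholder inputs, a ratio below 1 means the soft accelerator wins because the shorter wires outweigh its less efficient logic; a ratio above 1 means the hard accelerator wins. For example, `soft_to_hard_ratio(1, 1, 1.0, 2.0, 10.0, 1.0, 1.0)` places the soft accelerator ten times closer to its data, while passing equal distances isolates the logic overhead alone.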
