High-level synthesis of approximate hardware under joint precision and voltage scaling

In recent years, approximate computing has emerged as a promising approach for trading off the quality of computed outputs against energy savings. In this paper, we present an approximate high-level synthesis (AHLS) approach that generates a quality-energy optimized register-transfer-level implementation from an accurate high-level C description. Existing AHLS work considers only the reduction in switching activity as a source of energy savings under hardware approximations. By contrast, we aim to provide a general AHLS solution that also exploits voltage scaling made possible by the reduced processing time of approximated designs. To maximize voltage reductions and the associated energy savings, we include two approximation techniques: operation-level approximation by bit rounding and the more aggressive elimination of entire operations. Optimally exploiting scaling opportunities under such approximations requires tight interaction with scheduling tasks. We address this problem by combining an optimization pass that estimates the scheduling impact of approximations with fast yet accurate quality-energy models and an efficient optimization solver that finds near-optimal solutions constructively. Results show that considering voltage scaling yields up to 24.5% higher energy savings than approaches that consider only switching activity. Our heuristic solver finds solutions whose average energy savings are within 0.1% of those found by an exhaustive search, while being up to 1,400× faster than simulation-based methods.
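
The link between reduced processing time and voltage scaling can be illustrated with a minimal sketch. It is not the models or solver of this work: it assumes the textbook alpha-power-law delay model (delay proportional to V / (V - Vt)^alpha) and dynamic energy proportional to switched capacitance times V^2. The function names, the threshold voltage, the exponent, and the cycle counts are all hypothetical values chosen for illustration.

```python
# Illustrative sketch only: textbook delay/energy models, hypothetical parameters.

def gate_delay(v, vt=0.3, alpha=1.3):
    """Relative gate delay under the alpha-power law: V / (V - Vt)^alpha."""
    return v / (v - vt) ** alpha


def min_voltage_for_deadline(cycles_approx, cycles_exact,
                             v_nom_mv=1000, vt=0.3, alpha=1.3):
    """Lowest supply voltage (in volts, 1 mV steps) at which the shorter
    approximate schedule still finishes within the exact design's time budget."""
    # Time budget: the exact schedule running at nominal voltage.
    budget = cycles_exact * gate_delay(v_nom_mv / 1000.0, vt, alpha)
    v_mv = v_nom_mv
    # Lower the voltage 1 mV at a time while the deadline is still met
    # and the supply stays above the threshold voltage.
    while (v_mv - 1) / 1000.0 > vt and \
            cycles_approx * gate_delay((v_mv - 1) / 1000.0, vt, alpha) <= budget:
        v_mv -= 1
    return v_mv / 1000.0


def relative_energy(switching_ratio, v, v_nom=1.0):
    """Dynamic energy relative to the exact design at nominal voltage.
    switching_ratio is the approximate design's switched capacitance
    relative to the exact design (assumed given)."""
    return switching_ratio * (v / v_nom) ** 2


if __name__ == "__main__":
    # Hypothetical case: approximations shorten the schedule from 10 to 8 cycles
    # and cut switching activity by 10%.
    v = min_voltage_for_deadline(cycles_approx=8, cycles_exact=10)
    print("scaled voltage:", v)
    print("energy, switching only:", relative_energy(0.9, 1.0))
    print("energy, with voltage scaling:", relative_energy(0.9, v))
```

Under these assumptions, the quadratic dependence of energy on voltage is what makes schedule slack valuable: any cycles removed by rounding or eliminating operations can be converted into a lower supply voltage on top of the switching-activity savings.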
