A Parallelized Iterative Improvement Approach to Area Optimization for LUT-Based Technology Mapping

Modern FPGA synthesis tools typically apply a predetermined sequence of logic optimizations to the input logic network before carrying out technology mapping. While these "known recipes" of logic transformations often improve mapping results, a nontrivial gap remains between the quality metrics that drive pre-mapping logic optimization and those targeted by technology mapping itself. This miscorrelation can ultimately lead to suboptimal quality of results. In this paper, we propose PIMap, which couples logic transformations and technology mapping in an iterative improvement framework to minimize circuit area for LUT-based FPGAs. In each iteration, PIMap randomly proposes a transformation on the given logic network from an ensemble of candidate optimizations; it then invokes technology mapping and uses the mapping result to determine the likelihood of accepting the proposed transformation. To mitigate the runtime overhead, we further introduce parallelization techniques that decompose a large design into multiple smaller sub-netlists that can be optimized simultaneously. Experimental results show that our approach achieves promising area improvement on a set of commonly used benchmarks. Notably, PIMap reduces LUT usage by up to 14%, and by 7% on average, relative to the best-known records for the EPFL arithmetic benchmark suite.
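
The propose, map, and accept-or-reject loop described in the abstract can be illustrated with a minimal Python sketch. It is not PIMap's implementation. The acceptance rule is assumed to be a Metropolis-style criterion with a fixed temperature, since the abstract states only that the mapping result determines the likelihood of acceptance. The names `accept_probability`, `iterative_improve`, `toy_map_area`, `nudge_up`, and `nudge_down` are hypothetical, and the toy "network" (a tuple of integers whose "mapped area" is the sum of absolute values) merely stands in for a real logic network and LUT mapper.

```python
import math
import random


def accept_probability(current_area, proposed_area, temperature):
    """Assumed Metropolis-style rule: always accept a proposal that does not
    increase area; accept an area increase with exponentially decaying odds."""
    if proposed_area <= current_area:
        return 1.0
    return math.exp(-(proposed_area - current_area) / temperature)


def iterative_improve(network, transforms, map_area, iterations,
                      temperature=1.0, seed=0):
    """Randomly propose transformations and keep them with a probability
    derived from the mapped area; return the best network seen and its area."""
    rng = random.Random(seed)
    current, current_area = network, map_area(network)
    best, best_area = current, current_area
    for _ in range(iterations):
        transform = rng.choice(transforms)    # pick one candidate optimization
        candidate = transform(current, rng)   # apply it to the current network
        candidate_area = map_area(candidate)  # "technology mapping" feedback
        p = accept_probability(current_area, candidate_area, temperature)
        if rng.random() < p:
            current, current_area = candidate, candidate_area
            if current_area < best_area:
                best, best_area = current, current_area
    return best, best_area


# --- Hypothetical stand-ins for a logic network, its mapper, and transforms ---

def toy_map_area(network):
    """Toy cost: pretend the mapped LUT count is the sum of absolute values."""
    return sum(abs(x) for x in network)


def nudge_up(network, rng):
    """Toy transformation: increment one randomly chosen entry."""
    i = rng.randrange(len(network))
    return network[:i] + (network[i] + 1,) + network[i + 1:]


def nudge_down(network, rng):
    """Toy transformation: decrement one randomly chosen entry."""
    i = rng.randrange(len(network))
    return network[:i] + (network[i] - 1,) + network[i + 1:]


if __name__ == "__main__":
    start = (5, -3, 8, 2)
    best, area = iterative_improve(start, [nudge_up, nudge_down],
                                   toy_map_area, iterations=500)
    print("initial area:", toy_map_area(start), "best area found:", area)
```

Because the best network seen so far is tracked separately from the current one, occasionally accepting a worse proposal never degrades the returned result. Under the same assumptions, the decomposition mentioned in the abstract would correspond to running `iterative_improve` independently on each sub-netlist.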
