Improving FPGA-Based Logic Emulation Systems through Machine Learning

We present a machine learning (ML) framework to improve the use of computing resources in the FPGA compilation step of a commercial FPGA-based logic emulation flow. Our ML models accurately predict final place-and-route design quality, runtime, and optimal mapping parameters. Using these models, we identify key compilation features that may require aggressive compilation effort. Experiments on our large-scale database from an industrial emulation system show that our ML models reduce the total number of jobs required for a given netlist by 33%. Moreover, our job scheduling algorithm, which is based on our ML model, reduces the overall time to completion of concurrent compilation runs by 24%. In addition, we propose a new method that computes “recommendations” from our ML model to guide re-partitioning of difficult partitions. Tested on a large-scale industrial system-on-chip design, our recommendation flow provides an additional 15% compile time savings for the entire system-on-chip. To use our ML model inside the time-critical multi-FPGA partitioning step, we implement it in an optimized multi-threaded representation.
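
The abstract does not say how predicted runtimes feed into job scheduling. As a rough illustration only, the sketch below assumes a longest-processing-time-first heuristic on identical workers: jobs are sorted by predicted runtime and each is assigned to the currently least-loaded worker. The function name `lpt_schedule`, the partition names, and the runtime values are hypothetical and are not taken from the paper.

```python
import heapq


def lpt_schedule(predicted_runtimes, num_workers):
    """Greedy longest-processing-time-first assignment (illustrative only).

    predicted_runtimes: dict mapping a job name to its predicted runtime.
    num_workers: number of identical compile workers (an assumption).
    Returns (assignment, makespan), where assignment maps a worker index
    to the list of jobs placed on it.
    """
    # Min-heap of (accumulated load, worker index).
    loads = [(0.0, w) for w in range(num_workers)]
    heapq.heapify(loads)
    assignment = {w: [] for w in range(num_workers)}

    # Place the longest predicted jobs first, each on the least-loaded worker.
    jobs = sorted(predicted_runtimes.items(), key=lambda kv: kv[1], reverse=True)
    for job, runtime in jobs:
        load, worker = heapq.heappop(loads)
        assignment[worker].append(job)
        heapq.heappush(loads, (load + runtime, worker))

    makespan = max(load for load, _ in loads)
    return assignment, makespan


# Hypothetical predicted compile times (arbitrary units) for five partitions.
predicted = {"p0": 8, "p1": 7, "p2": 6, "p3": 5, "p4": 4}
assignment, makespan = lpt_schedule(predicted, num_workers=2)
```

The quality of any such schedule depends on how well the predicted runtimes match the actual compile times, which is why an accurate runtime model matters for the time to completion of concurrent runs.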
