Paper Information - ABCDPlace: Accelerated Batch-Based Concurrent Detailed Placement on Multithreaded CPUs and GPUs

Placement is an important step in modern very-large-scale integrated (VLSI) designs. Detailed placement is a placement refining procedure intensively called throughout the design flow, so its efficiency has a vital impact on design closure. However, since most detailed placement techniques are inherently greedy and sequential, they are generally difficult to parallelize. In this article, we present a concurrent detailed placement framework, ABCDPlace, exploiting multithreading and graphics processing unit (GPU) acceleration. We propose batch-based concurrent algorithms for widely adopted sequential detailed placement techniques, such as independent set matching, global swap, and local reordering. The experimental results demonstrate that ABCDPlace can achieve $2\times$–$5\times$ faster runtime than sequential implementations with multithreaded CPU and over $10\times$ with GPU on ISPD 2005 contest benchmarks without quality degradation. On larger industrial benchmarks, we show more than $16\times$ speedup with GPU over the state-of-the-art sequential detailed placer. ABCDPlace finishes the detailed placement of a 10-million-cell industrial design in 1 min.
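To make one of the named techniques concrete, here is a minimal sketch of sequential local reordering: within one row, every permutation of a small window of consecutive cells is tried and the ordering with the lowest wirelength is kept. It illustrates only the sequential baseline, not the batch-based concurrent algorithm proposed in the paper. The function names (`hpwl_x`, `reorder_window`), the single-row setup, the use of each cell's left edge as its pin location, and the x-only wirelength are all simplifying assumptions made for this example.

```python
from itertools import permutations


def hpwl_x(nets, x):
    """Horizontal half-perimeter wirelength.

    Each net is a list of cell ids; a cell's pin is assumed to sit at
    its left edge x[c] (a simplification for this sketch).
    """
    return sum(max(x[c] for c in net) - min(x[c] for c in net) for net in nets)


def reorder_window(order, widths, x, nets, start, k):
    """Try all orderings of k consecutive cells in one row; keep the best.

    order  : cell ids in left-to-right order within the row (updated in place)
    widths : width of each cell, indexed by cell id
    x      : left x-coordinate of each cell, indexed by cell id (updated in place)
    Returns the wirelength after the move (unchanged if nothing improves).
    """
    window = order[start:start + k]
    left = x[window[0]]            # left boundary of the window
    best_cost = hpwl_x(nets, x)    # cost of leaving the row untouched
    best = None
    for perm in permutations(window):
        trial = list(x)
        pos = left
        for c in perm:             # pack the permuted cells from the left edge
            trial[c] = pos
            pos += widths[c]
        cost = hpwl_x(nets, trial)
        if cost < best_cost:       # accept strict improvements only
            best_cost, best = cost, (perm, trial)
    if best is not None:
        perm, trial = best
        order[start:start + k] = perm
        x[:] = trial
    return best_cost


# Hypothetical 4-cell row: swapping cells 1 and 2 shortens both nets.
order = [0, 1, 2, 3]
widths = [1, 2, 1, 1]
x = [0, 1, 3, 4]
nets = [[0, 2], [1, 3]]
print(reorder_window(order, widths, x, nets, start=1, k=2))  # prints 3
print(order, x)  # prints [0, 2, 1, 3] [0, 2, 1, 4]
```

Because the cost of enumeration grows as k!, the window has to stay small. Each window is processed one after another here, and each move changes the positions that the next window sees; that dependency is the greedy, sequential behavior the abstract describes as difficult to parallelize.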
