Exploiting cache locality to speed up register clustering

Physical design tools must handle huge amounts of data to solve problems for circuits with millions of cells. Traditionally, Electronic Design Automation tools are implemented using Object-Oriented Design. However, this paradigm may lead to overly complex objects that waste cache space. The wasted space hinders the exploitation of cache locality and, consequently, degrades software runtime. This work proposes applying Data-Oriented Design to the register clustering problem. Unlike traditional Object-Oriented Design, the Data-Oriented Design programming model focuses on how data is organized in memory. As a consequence, this programming model may better exploit cache spatial locality. To evaluate the impact of using Data-Oriented Design for register clustering, we implemented two software prototypes (one sequential and one parallel) of the K-means clustering algorithm for each programming model. Experimental results show that the sequential Data-Oriented Design implementation is on average 7.5% faster than the sequential Object-Oriented Design implementation, while the parallel Data-Oriented Design implementation is 15% faster than its Object-Oriented counterpart.
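The contrast between the two programming models can be illustrated with the assignment step of K-means, where each register is mapped to its nearest cluster center. The following is only a minimal sketch under stated assumptions, not the implementation evaluated in this work: the types `RegisterObject`, `RegisterData`, and `Centers`, the `slack` field, and the functions `assign_ood` and `assign_dod` are hypothetical names introduced for illustration. In the object-oriented layout, the coordinates read by the loop sit next to fields the loop never uses, so each fetched cache line carries unused bytes. In the data-oriented layout, the coordinates are stored in contiguous arrays, so each fetched cache line holds only data the loop consumes.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

// Hypothetical object-oriented layout: one object per register.
// The assignment step reads only x and y, yet name and slack
// travel through the cache together with them.
struct RegisterObject {
    std::string name;  // not used by the assignment step
    double x;
    double y;
    double slack;      // hypothetical attribute, not used here
    int cluster;
};

// Hypothetical data-oriented layout: one contiguous array per property.
struct RegisterData {
    std::vector<double> x;
    std::vector<double> y;
    std::vector<int> cluster;
};

// Cluster centers, stored as parallel coordinate arrays.
struct Centers {
    std::vector<double> x;
    std::vector<double> y;
};

// Index of the center closest to (px, py) by squared Euclidean distance.
// Returns -1 if there are no centers.
int nearest_center(double px, double py, const Centers& c) {
    int best = -1;
    double best_dist = std::numeric_limits<double>::infinity();
    for (std::size_t k = 0; k < c.x.size(); ++k) {
        const double dx = px - c.x[k];
        const double dy = py - c.y[k];
        const double dist = dx * dx + dy * dy;
        if (dist < best_dist) {
            best_dist = dist;
            best = static_cast<int>(k);
        }
    }
    return best;
}

// Assignment step over the object-oriented layout.
void assign_ood(std::vector<RegisterObject>& regs, const Centers& c) {
    for (auto& r : regs) {
        r.cluster = nearest_center(r.x, r.y, c);
    }
}

// Assignment step over the data-oriented layout.
void assign_dod(RegisterData& regs, const Centers& c) {
    assert(regs.x.size() == regs.y.size());
    regs.cluster.resize(regs.x.size());
    for (std::size_t i = 0; i < regs.x.size(); ++i) {
        regs.cluster[i] = nearest_center(regs.x[i], regs.y[i], c);
    }
}
```

Both functions compute the same assignment; only the memory layout differs. Any runtime difference between them depends on the object size, the number of registers, and the target cache hierarchy, so the sketch should be measured rather than assumed to reproduce the reported speedups.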