Architecture-Aware Packing and CAD Infrastructure for Field-Programmable Gate Arrays

Jason Luu, Doctor of Philosophy, Graduate Department of Electrical and Computer Engineering, University of Toronto, 2014

As technology scaling, human creativity, and other factors open new markets for FPGAs, the architectures of such chips must continue to evolve to meet changing demands. However, public domain software tools available to explore future FPGA architectures have not kept pace with advances in the field. Furthermore, these tools often embed strong architectural assumptions within the source code itself, which, short of a major software rewrite, limits their use to simple variations of particular architectures. In this thesis, we describe contributions to a large open-source collaborative project, called Verilog-to-Routing (VTR), that relaxes such limitations by providing an extensive software infrastructure for FPGA architecture exploration and CAD research. This infrastructure includes modern benchmarks, sample architecture description files, and a CAD flow that can target a broad space of architectures. We then describe new techniques in the packing stage of the CAD flow, which allow the packer both to target FPGAs with modern architectural features and to adjust its computational effort based on architectural complexity. Finally, we conduct an architecture experiment on hard adders and carry chains to show the new capabilities of the software infrastructure and to quantitatively answer questions about the actual effectiveness of these classical architectural features.
