Compilation Method of Reconfigurable Cryptographic Processors

As a domain-specific implementation of reconfigurable computing processors, a reconfigurable cryptographic processor inherits their basic compilation framework. The algorithm is described in a high-level programming language; hardware/software partitioning is performed through static or dynamic analysis; the hardware part is then transformed into a universal intermediate representation by the front-end compilation tools and optimized by the middle-end compilation tools; finally, the back-end compilation tools, including the synthesis tool and the placement and routing tool, perform the mapping and generate the configuration information of the reconfigurable computing structure. This chapter builds on this framework while accounting for the particular requirements of compiling for reconfigurable cryptographic processors. Because cipher algorithms have distinctive code features, such as fixed-boundary loops, loop-carried data dependencies, simple control flow, and widely varying data granularity, the compiler of a reconfigurable cryptographic processor needs to be optimized for these features. The chapter first introduces the general compilation technologies and methods of reconfigurable computing processors, including the main steps of the compilation process. It then discusses the compilation methods of reconfigurable cryptographic processors, focusing on the steps that matter most for cipher applications, such as code transformation and optimization and the partitioning and mapping of intermediate representations. Finally, it gives examples of the compilation and implementation of different cipher algorithms.
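To make the cipher code features named above concrete, the following is a minimal sketch of a fixed-boundary loop with a loop-carried data dependency, together with a version unrolled by a factor of two as a simple code transformation. The round function, the round count, the key values, and all names (`toy_round`, `encrypt_loop`, `encrypt_unrolled`, `ROUNDS`) are hypothetical and chosen only for illustration; this is not a real cipher and not the chapter's own method.

```python
MASK32 = 0xFFFFFFFF
ROUNDS = 8  # fixed loop bound known before execution (illustrative value)


def rotl32(x, r):
    """Rotate a 32-bit word left by r bits (0 < r < 32)."""
    return ((x << r) | (x >> (32 - r))) & MASK32


def toy_round(left, right, key):
    """One Feistel-style round of a made-up cipher (assumption, not a real algorithm)."""
    f = (rotl32(right, 5) + key) & MASK32
    return right, left ^ f


def encrypt_loop(left, right, round_keys):
    # Fixed-boundary loop: the trip count is the constant ROUNDS.
    # Loop-carried dependency: each iteration consumes the (left, right)
    # state produced by the previous iteration, so iterations cannot run
    # independently of one another.
    for i in range(ROUNDS):
        left, right = toy_round(left, right, round_keys[i])
    return left, right


def encrypt_unrolled(left, right, round_keys):
    # The same computation unrolled by a factor of 2. Because ROUNDS is a
    # known constant divisible by 2, no remainder loop is needed.
    for i in range(0, ROUNDS, 2):
        left, right = toy_round(left, right, round_keys[i])
        left, right = toy_round(left, right, round_keys[i + 1])
    return left, right


if __name__ == "__main__":
    keys = [(0x9E3779B9 * (i + 1)) & MASK32 for i in range(ROUNDS)]
    print(encrypt_loop(0x01234567, 0x89ABCDEF, keys))
    print(encrypt_unrolled(0x01234567, 0x89ABCDEF, keys))
```

Both functions compute the same result. Unrolling exposes a larger loop body to the mapper without changing the dependency chain between rounds, which is why the constant trip count matters for this kind of transformation.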
