Design and evaluation of compact ISA extensions

The modern embedded market relies heavily on RISC processors. The code density of such processors directly affects memory usage, an expensive resource. Solutions to mitigate this issue include code compression techniques and ISA extensions with reduced instruction bit-width, such as Thumb2 and MicroMIPS. This paper proposes SPARC16, a 16-bit extension to the SPARC processor. Additionally, we provide the first methodology for generating 16-bit ISAs and evaluate compression among different 16-bit extensions. SPARC16 programs can achieve better compression ratios than other extensions, with ratios as low as 67%. Moreover, SPARC16 reduces cache miss rates by up to 9%, so it requires smaller caches than SPARC processors to achieve the same performance; the cache size can be reduced by a factor of up to 16.
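
For readers unfamiliar with the metric, the block below sketches how a compression ratio such as the 67% figure is commonly computed in the code compression literature. It assumes the usual convention that the ratio is the compressed size divided by the original size, so lower is better; the abstract itself does not state the formula, and the symbol CR is introduced here only for illustration.

```latex
\[
\mathrm{CR} \;=\; \frac{\text{size of the program using the 16-bit extension}}
                       {\text{size of the original 32-bit program}} \times 100\%
\]
```

Under this assumption, a hypothetical 1000-byte SPARC program with a ratio of 67% would occupy about 670 bytes when built for SPARC16.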
