Implementing dynamic implied addressing mode for multi-output instructions

The ever-increasing demand for faster execution, lower resource usage and lower energy consumption has compelled architects of embedded processors to adopt more specialized hardware features, with irregular data paths and heterogeneous registers customized to the needs of their target applications. These processors consequently provide a rich set of specialized instructions that give programmers access to these features. Such an instruction is typically a multi-output instruction (MOI), which produces multiple results in parallel in order to exploit the parallelism inherent in the underlying hardware. Earlier studies have shown that MOIs help to improve performance in terms of instruction count and code size. However, because MOIs require more operands, they tend to increase not only the size of the instruction set but also the size of individual instructions. This can be a serious setback for embedded processors, which are mostly subject to tight resource limitations (in this case, limited instruction encoding space in particular). For this reason, these processors can often include only a very small subset of the desired MOIs in their instruction sets, even though there may be sufficient silicon real estate to accommodate these specialized MOIs. To attack this problem, we introduce a novel instruction encoding scheme based on the dynamic implied addressing mode (DIAM). In this paper, we discuss how we have overcome the encoding space problem for our target embedded processor, whose instruction set has been augmented with a variety of MOIs. Our DIAM-based encoding scheme employs a small on-chip buffer that supplies extra encoding information for MOIs at run time. The empirical results are promising: the scheme allows us to encode many more MOIs for our processor, and adding the DIAM to the original architecture yields a considerable reduction in both code size and running time.
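The idea of supplying implied operand information from a small buffer can be illustrated with a toy model. The following Python sketch is not the encoding scheme of the paper: the class `DiamBuffer`, the function `execute_divmod`, the choice of a quotient/remainder MOI, and the buffer size are all hypothetical and chosen only for illustration. It assumes that an MOI names its first destination register explicitly, while its second destination is implied and is looked up at run time in a buffer slot that was set up earlier.

```python
class DiamBuffer:
    """Hypothetical on-chip buffer holding implied operand fields (assumption)."""

    def __init__(self, size=4):
        # Each entry stores a register number that an MOI may use implicitly.
        self.entries = [0] * size

    def set(self, slot, reg):
        # Models a set-up step that loads encoding information into the buffer.
        self.entries[slot] = reg

    def get(self, slot):
        return self.entries[slot]


def execute_divmod(regs, buf, rd, rs1, rs2, slot):
    """Toy MOI: quotient goes to the explicit register rd, remainder goes
    to the register implied by the given buffer slot (assumed semantics)."""
    quotient, remainder = divmod(regs[rs1], regs[rs2])
    regs[rd] = quotient
    regs[buf.get(slot)] = remainder


# Usage: r1 = 17, r2 = 5; the second destination (r4) is taken from the buffer.
regs = [0] * 8
regs[1], regs[2] = 17, 5
buf = DiamBuffer()
buf.set(0, 4)
execute_divmod(regs, buf, rd=3, rs1=1, rs2=2, slot=0)
```

In this sketch the instruction carries only a short slot index in place of a full register field for the second result, which is the sense in which the buffer could relieve pressure on the encoding space; the actual field widths and buffer organization are not specified here.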
