Orthogonal Instruction Processing: An Alternative to Lightweight VLIW Processors

We propose a new processor architecture called Orthogonal Instruction Processing (OIP). In contrast to Very Long Instruction Word (VLIW) decoding, OIP decodes the sub-instruction words of each Functional Unit (FU) orthogonally. The OIP architecture is thereby able to reduce the overall machine code size of VLIW programs significantly. We show analytically as well as experimentally that, compared to a VLIW processor, the savings in instruction memory size easily compensate for the overhead of the separate branch unit needed for each FU. For the analysis, a mathematical model of the hardware costs of an OIP processor is developed and compared to that of a conventional VLIW processor. In addition, we compare the code size of selected representative programs on the new processor architecture and show substantial savings in program memory. Here, the instruction memory requirements can be decreased by a factor of 0.465. This decrease in instruction memory, despite the discussed overhead, leads to savings in the overall hardware costs of one processor by a factor of 0.989.
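The trade-off between smaller per-FU programs and the extra branch logic can be illustrated with a toy calculation. The sketch below is not the cost model developed in the paper. It assumes, purely for illustration, that each FU repeats its own short instruction pattern, that a VLIW program with a single program counter must store a common period of all patterns, and that OIP pays a fixed number of extra bits per instruction for its per-FU branch unit. The function names, the parameter `branch_bits`, and all numbers are hypothetical.

```python
from functools import reduce
from math import gcd


def common_period(periods):
    """Least common multiple of all per-FU pattern lengths."""
    return reduce(lambda a, b: a * b // gcd(a, b), periods, 1)


def vliw_bits(periods, widths):
    """Toy VLIW size: one wide word per cycle of the common period.

    Assumption: a single program counter forces every FU's pattern
    to be replicated until all patterns line up again.
    """
    return common_period(periods) * sum(widths)


def oip_bits(periods, widths, branch_bits):
    """Toy OIP size: each FU stores only its own pattern.

    Assumption: the per-FU branch unit costs `branch_bits` extra
    bits per stored sub-instruction (hypothetical overhead).
    """
    return sum(p * (w + branch_bits) for p, w in zip(periods, widths))


if __name__ == "__main__":
    periods = [2, 3, 4]    # hypothetical pattern length per FU
    widths = [16, 16, 16]  # hypothetical sub-instruction width in bits
    v = vliw_bits(periods, widths)
    o = oip_bits(periods, widths, branch_bits=4)
    print(v, o, o / v)
```

With these made-up inputs the per-FU encoding is smaller even after paying the branch overhead. The ratio it prints has no relation to the measured factors reported above.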
