Custom instructions with local memory elements without expensive DMA transfers

Traditionally, instruction set extension (ISE) algorithms have treated memory and control-flow operations as invalid during custom instruction identification, in order to ensure deterministic latency of the extended instructions. To overcome these constraints, some prior work has incorporated local memory so that custom instructions can include memory operations. Such architectures have invariably relied on expensive Direct Memory Access (DMA) transfers to move data. Cache-coherence management poses another challenge in such systems and requires additional hardware and/or software intervention. We propose a novel custom instruction architecture capable of incorporating certain types of memory and control-flow operations. Unlike existing architectures, the proposed design eliminates the need for expensive DMA transfers and additional cache management subsystems, thereby saving significant time and energy. Our method focuses mainly on accelerating code segments that access static variables and stack-allocated variables, both of which are widely prevalent in embedded applications. Experimental results show that the proposed method achieves a performance gain of up to 47% over the base processor implementation.
