Memory access micro-profiling for ASIP design

The memory subsystem is the major performance bottleneck, in terms of both speed and power consumption, in today's digital systems. This is especially true for application-specific embedded systems, where the power consumed by memory traffic, the memory latency and the size of the on-chip caches play a significant role in overall system performance, energy efficiency and cost. There is an urgent need for tools that help designers make informed decisions about memory subsystems for embedded applications. This paper presents a novel, fine-grained memory profiling technique that provides the designer with valuable information such as the total dynamic memory requirement of an application, the most heavily accessed source-level data objects and the most memory-intensive portions of an application. Such information can help designers make decisions about the overall memory subsystem, which may comprise several cache levels, scratch-pad memories and main memory. It can also be used by a compiler to perform advanced compiler-controlled memory assignment, and by the application programmer to optimize an application. Case studies at the end of this paper demonstrate the accuracy of the profiling technique and present some example usage scenarios.
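To make the reported metrics concrete, the following minimal sketch shows how the three quantities named above (peak dynamic memory requirement, access counts per source-level data object, and memory traffic per program portion) could be aggregated from a memory event trace. It is an illustration only and not the profiler described in this paper: the trace format, the function name `summarize_trace` and the sample object and function names are all hypothetical assumptions.

```python
from collections import Counter


def summarize_trace(events):
    """Aggregate a hypothetical memory event trace into profile metrics.

    Assumed event shapes (not taken from the paper):
      ("alloc", alloc_id, size_in_bytes)
      ("free", alloc_id)
      ("load" | "store", object_name, function_name, size_in_bytes)
    """
    live = {}                     # allocation id -> size in bytes
    current = peak = 0            # current and peak dynamic memory in bytes
    object_accesses = Counter()   # data object name -> number of accesses
    function_traffic = Counter()  # function name -> bytes transferred

    for ev in events:
        kind = ev[0]
        if kind == "alloc":
            _, alloc_id, size = ev
            live[alloc_id] = size
            current += size
            peak = max(peak, current)
        elif kind == "free":
            _, alloc_id = ev
            # Unknown ids are ignored rather than treated as errors.
            current -= live.pop(alloc_id, 0)
        elif kind in ("load", "store"):
            _, obj, func, nbytes = ev
            object_accesses[obj] += 1
            function_traffic[func] += nbytes

    return peak, object_accesses, function_traffic


# Hypothetical trace for a small filter kernel.
trace = [
    ("alloc", 1, 1024),
    ("load", "coeffs", "fir_filter", 4),
    ("store", "buffer", "fir_filter", 4),
    ("alloc", 2, 512),
    ("load", "coeffs", "fir_filter", 4),
    ("free", 1),
    ("load", "buffer", "main", 4),
]

peak, object_accesses, function_traffic = summarize_trace(trace)
print("peak dynamic memory (bytes):", peak)
print("accesses per data object:", dict(object_accesses))
print("traffic per function (bytes):", dict(function_traffic))
```

A designer could use rankings of this kind to shortlist candidates for scratch-pad placement, for example by sorting `object_accesses` with `most_common()`. How such a trace is actually obtained is outside the scope of this sketch.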
