QUAD - A Memory Access Pattern Analyser

In this paper, we present the Quantitative Usage Analysis of Data (QUAD) tool, a sophisticated memory access tracing tool that provides a comprehensive quantitative analysis of an application's memory access patterns, with the primary goal of detecting actual data dependencies at the function level. As improvements in processing performance continue to outpace improvements in memory performance, tools for understanding memory access behavior are vital for optimizing the execution of data-intensive applications on heterogeneous architectures. The tool, the first of its kind, is described in detail, and its benefits are demonstrated on a real case study, the x264 benchmarking application.
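
As a rough illustration of what function-level data dependency detection from a memory trace can look like, the sketch below tracks the last function to write each address and attributes every later read to a producer/consumer pair. It is a minimal sketch under stated assumptions, not QUAD's actual implementation: the trace format `(function, op, address)`, the name `function_level_dependencies`, and the example function names are all hypothetical.

```python
from collections import defaultdict

def function_level_dependencies(trace):
    """Summarize producer -> consumer data flow from a memory trace.

    trace: iterable of (function, op, address) tuples, where op is "R" or "W"
    and each entry stands for a one-byte access (an assumed, simplified format).
    Returns {(producer, consumer): (reads, unique_addresses)}.
    """
    last_writer = {}                  # address -> function that last wrote it
    reads = defaultdict(int)          # (producer, consumer) -> number of reads
    unique_addrs = defaultdict(set)   # (producer, consumer) -> distinct addresses read

    for func, op, addr in trace:
        if op == "W":
            last_writer[addr] = func
        elif op == "R":
            producer = last_writer.get(addr)
            if producer is not None:  # ignore reads of never-written addresses
                reads[(producer, func)] += 1
                unique_addrs[(producer, func)].add(addr)

    return {edge: (count, len(unique_addrs[edge])) for edge, count in reads.items()}

# Hypothetical three-stage pipeline: decode writes, filter reads and writes, encode reads.
trace = [
    ("decode", "W", 0x10),
    ("decode", "W", 0x11),
    ("filter", "R", 0x10),
    ("filter", "R", 0x10),
    ("filter", "R", 0x11),
    ("filter", "W", 0x20),
    ("encode", "R", 0x20),
]
deps = function_level_dependencies(trace)
```

Each resulting edge records how many reads a consumer function performed on data last written by a producer function, and over how many distinct addresses. This captures dependencies that actually occurred at run time rather than those a static analysis would conservatively assume.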
