An integer linear programming based approach to simultaneous memory space partitioning and data allocation for chip multiprocessors

Trends in advanced integrated circuit technologies require us to look for new ways to utilize large numbers of gates and to reduce the effects of high interconnect delays. One promising research direction is chip multiprocessors, which integrate multiple processors on the same die. Among the components of a chip multiprocessor, the memory subsystem is perhaps the most critical, since it shapes both the power and the performance characteristics of the resulting design. Motivated by this observation, this paper addresses the problem of decomposing (partitioning) the on-chip memory space across parallel processors and allocating data across memory components in an integrated manner. In the most general case, the resulting memory architecture is a hybrid one, where some memory components are accessed privately, whereas others are shared by two or more processors. The proposed approach has two complementary components: an optimizing compiler and an ILP (integer linear programming) solver. The role of the compiler is to analyze the application code and detect the interprocessor data sharing patterns, given the loop parallelization information. The job of the ILP solver, on the other hand, is to determine the sizes of the on-chip memory components, how these memory components are shared across multiple processors in the system, and what data each component holds. In other words, we address the problem of integrated memory space partitioning and data allocation for chip multiprocessors.
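To make the joint decision concrete, the following is a minimal illustrative sketch and not the formulation used in this paper. It replaces the ILP solver with exhaustive enumeration over a toy instance, and every name and number in it is an assumption: the per-processor access counts (standing in for what the compiler's sharing analysis would report), the block sizes, the on-chip capacity, and the three access costs. Each data block is placed in one processor's private memory, in a memory shared by all processors, or off-chip; the sizes of the memory components then follow from what is placed in them.

```python
from itertools import product

def best_allocation(accesses, sizes, capacity,
                    private_cost=1, shared_cost=2, offchip_cost=10):
    """Exhaustively search placements of data blocks (toy stand-in for an ILP).

    accesses[block][p] is the assumed number of accesses by processor p.
    Returns (total_cost, placement, component_sizes).
    """
    blocks = sorted(accesses)
    num_procs = len(next(iter(accesses.values())))
    places = [("private", p) for p in range(num_procs)]
    places += [("shared", None), ("offchip", None)]

    best = None
    for choice in product(places, repeat=len(blocks)):
        # Capacity constraint: on-chip blocks must fit in the total budget.
        used = sum(sizes[b] for b, (kind, _) in zip(blocks, choice)
                   if kind != "offchip")
        if used > capacity:
            continue
        cost = 0
        for b, (kind, owner) in zip(blocks, choice):
            for p, count in enumerate(accesses[b]):
                if kind == "shared":
                    cost += count * shared_cost
                elif kind == "private" and p == owner:
                    cost += count * private_cost
                else:
                    # Off-chip block, or a private block owned by another processor.
                    cost += count * offchip_cost
        if best is None or cost < best[0]:
            best = (cost, dict(zip(blocks, choice)))

    cost, placement = best
    # Memory component sizes are derived from the chosen placement.
    component_sizes = {}
    for b, place in placement.items():
        if place[0] != "offchip":
            component_sizes[place] = component_sizes.get(place, 0) + sizes[b]
    return cost, placement, component_sizes

# Hypothetical instance: two processors, four equally sized blocks.
example_accesses = {"A": [100, 0], "B": [0, 80], "C": [50, 50], "D": [5, 5]}
example_sizes = {"A": 4, "B": 4, "C": 4, "D": 4}
cost, placement, component_sizes = best_allocation(
    example_accesses, example_sizes, capacity=12)
```

In this toy instance the blocks touched by a single processor end up in that processor's private memory, the heavily shared block goes to the shared memory, and the rarely used block is left off-chip, which yields a hybrid organization of the kind described above. Enumeration is only workable at this scale; an ILP solver would encode the same choices as 0-1 variables with linear capacity constraints.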
