Paper Information - Balancing On-Chip Network Latency in Multi-application Mapping for Chip-Multiprocessors

As the number of cores continues to grow in chip multiprocessors (CMPs), application-to-core mapping algorithms that leverage the non-uniform on-chip resource access time have been receiving increasing attention. However, existing mapping methods for reducing overall packet latency cannot meet the requirement of balanced on-chip latency when multiple applications are present. In this paper, we address the looming issue of balancing minimized on-chip packet latency with performance-awareness in the multi-application mapping of CMPs. Specifically, the proposed mapping problem is formulated, its NP-completeness is proven, and an efficient heuristic-based algorithm for solving the problem is presented. Simulation results show that the proposed algorithm is able to reduce the maximum average packet latency by 10.42% and the standard deviation of packet latency by 99.65% among concurrently running applications and, at the same time, incur little degradation in the overall performance.
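The two balance metrics quoted in the abstract (the maximum average packet latency and the standard deviation of packet latency among concurrently running applications) can be illustrated with a minimal sketch. It assumes that both metrics are computed over per-application average packet latencies (APL) and that the standard deviation is the population form; the abstract specifies neither. The function name `balance_metrics`, the application names, and the latency samples are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def balance_metrics(packet_latencies):
    """Summarize latency balance across applications.

    packet_latencies maps a (hypothetical) application name to a list of
    per-packet latencies, e.g. in cycles. Assumption: balance is judged on
    the per-application average packet latency (APL).
    """
    # Average packet latency of each application.
    apl = {app: mean(samples) for app, samples in packet_latencies.items()}
    values = list(apl.values())
    return {
        "apl": apl,
        # Worst-off application: the largest per-application APL.
        "max_apl": max(values),
        # Spread of APL across applications (population form, an assumption).
        "std_apl": pstdev(values),
    }


if __name__ == "__main__":
    # Made-up latency samples for two co-running applications.
    samples = {
        "app_a": [10.0, 20.0],
        "app_b": [30.0, 30.0],
    }
    print(balance_metrics(samples))
```

Under this reading, a mapping that balances latency well lowers both `max_apl` and `std_apl`, while the overall performance the abstract mentions would be tracked separately.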

Massoud Pedram | Lizhong Chen | Timothy Mark Pinkston | Di Zhu | Siyu Yue