Application mapping for express channel-based networks-on-chip

With the emergence of many-core multiprocessor system-on-chips (MPSoCs), the on-chip networks are facing serious challenges in providing fast communication for various tasks and cores. One promising solution shown in recent studies is to add express channels to the network as shortcuts to bypass intermediate routers, thereby reducing packet latency. However, this approach also greatly changes the packet delay estimation and traffic behaviors of the network, both of which have not yet been exploited in existing mapping algorithms. In this paper, we explore the opportunities in optimizing application mapping for express channel-based on-chip networks. Specifically, we derive a new delay model for this type of networks, identify their unique characteristics, and propose an efficient heuristic mapping algorithm that increases the bypassing opportunities by reducing unnecessary turns that would otherwise impose the entire router pipeline delay to packets. Simulation results show that the proposed algorithm can achieve a 2~4X reduction in the number of turns and 10~26% reduction in the average packet delay.

[1]  Chen Sun,et al.  DSENT - A Tool Connecting Emerging Photonics with Electronics for Opto-Electronic Networks-on-Chip Modeling , 2012, 2012 IEEE/ACM Sixth International Symposium on Networks-on-Chip.

[2]  Harold W. Kuhn,et al.  The Hungarian method for the assignment problem , 1955, 50 Years of Integer Programming.

[3]  Wayne H. Wolf,et al.  TGFF: task graphs for free , 1998, Proceedings of the Sixth International Workshop on Hardware/Software Codesign. (CODES/CASHE'98).

[4]  David S. Johnson,et al.  Computers and Intractability: A Guide to the Theory of NP-Completeness , 1978 .

[5]  Ronald L. Rivest,et al.  Introduction to Algorithms , 1990 .

[6]  Srinivasan Murali,et al.  Bandwidth-constrained mapping of cores onto NoC architectures , 2004, Proceedings Design, Automation and Test in Europe Conference and Exhibition.

[7]  Timothy Mattson,et al.  A 48-Core IA-32 message-passing processor with DVFS in 45nm CMOS , 2010, 2010 IEEE International Solid-State Circuits Conference - (ISSCC).

[8]  Klaudia Frankfurter Computers And Intractability A Guide To The Theory Of Np Completeness , 2016 .

[9]  Radu Marculescu,et al.  Energy-aware mapping for tile-based NoC architectures under performance constraints , 2003, ASP-DAC '03.

[10]  Mahmut T. Kandemir,et al.  Application mapping for chip multiprocessors , 2008, 2008 45th ACM/IEEE Design Automation Conference.

[11]  Shashi Kumar,et al.  A two-step genetic algorithm for mapping task graphs to a network on chip architecture , 2003, Euromicro Symposium on Digital System Design, 2003. Proceedings..

[12]  Brian W. Kernighan,et al.  An efficient heuristic procedure for partitioning graphs , 1970, Bell Syst. Tech. J..

[13]  Niraj K. Jha,et al.  Express virtual channels: towards the ideal interconnection fabric , 2007, ISCA '07.

[14]  William J. Dally,et al.  Flattened Butterfly Topology for On-Chip Networks , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).

[15]  Onur Mutlu,et al.  Express Cube Topologies for on-Chip Interconnects , 2009, 2009 IEEE 15th International Symposium on High Performance Computer Architecture.

[16]  Radu Marculescu,et al.  Non-Stationary Traffic Analysis and Its Implications on Multicore Platform Design , 2011, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.

[17]  P. Cochat,et al.  Et al , 2008, Archives de pediatrie : organe officiel de la Societe francaise de pediatrie.

[18]  William J. Dally,et al.  Design tradeoffs for tiled CMP on-chip networks , 2006, ICS '06.

[19]  David Z. Pan,et al.  A3MAP: Architecture-Aware Analytic Mapping for Networks-on-Chip , 2010, 2010 15th Asia and South Pacific Design Automation Conference (ASP-DAC).

[20]  Jörg Henkel,et al.  ADAM: Run-time agent-based distributed application mapping for on-chip communication , 2008, 2008 45th ACM/IEEE Design Automation Conference.