Design and evaluation of low latency interconnection networks for real-time many-core embedded systems

On-chip interconnection networks (OCINs) in many-core embedded systems account for large portions of the chip's area, cost, delay, and power. Besides being competitive in area, cost, and power, OCINs must have low diameters to meet real-time deadlines. Designing low-latency networks and sharing network resources are essential to achieving these goals. We explore 13 OCINs with various topologies and properties in 64-core systems; some of them, such as the Enhanced Kite and the Spidergon-Donut networks, are new. We derive and compare their worst-case delays, longest and average distances, critical link lengths, bisection bandwidths, total link and router costs, and total arbiter powers. The results indicate that the Enhanced Kite, Kite, Spidergon-Donut, and Spidergon-Donut4 networks stand out with the best worst-case delays, and that the Spidergon-Donut4 additionally features lower link and router costs, lower total arbiter power, and better 2D implementation and scalability.
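As a rough illustration of how the longest and average distances of a topology can be obtained, the sketch below runs a breadth-first search from every node and counts hops. It is not the paper's derivation or delay model: the function names are hypothetical, the Spidergon is assumed to be the commonly described variant (a bidirectional ring plus one "across" link per node), and an 8x8 2D mesh is included only as a familiar 64-node comparison point. The new networks in the abstract (Enhanced Kite, Spidergon-Donut) are not modeled because their link structure is not given here.

```python
from collections import deque


def spidergon(n):
    """Assumed Spidergon: each node links to its two ring neighbors
    and to the node diametrically across the ring (n must be even)."""
    return {i: [(i + 1) % n, (i - 1) % n, (i + n // 2) % n] for i in range(n)}


def mesh(rows, cols):
    """2D mesh without wraparound links; node id = row * cols + col."""
    adj = {}
    for r in range(rows):
        for c in range(cols):
            nbrs = []
            if r > 0:
                nbrs.append((r - 1) * cols + c)
            if r < rows - 1:
                nbrs.append((r + 1) * cols + c)
            if c > 0:
                nbrs.append(r * cols + c - 1)
            if c < cols - 1:
                nbrs.append(r * cols + c + 1)
            adj[r * cols + c] = nbrs
    return adj


def hop_distances(adj, src):
    """Breadth-first search: hop count from src to every reachable node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def longest_and_average_distance(adj):
    """Return (diameter, mean hop count over ordered pairs of distinct nodes)."""
    longest, total, pairs = 0, 0, 0
    for src in adj:
        dist = hop_distances(adj, src)
        longest = max(longest, max(dist.values()))
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return longest, total / pairs


if __name__ == "__main__":
    for name, adj in [("spidergon-64", spidergon(64)), ("mesh-8x8", mesh(8, 8))]:
        diameter, average = longest_and_average_distance(adj)
        print(f"{name}: diameter={diameter} hops, average={average:.3f} hops")
```

Under these assumptions the 64-node Spidergon has a diameter of 16 hops and the 8x8 mesh a diameter of 14 hops. Hop counts alone ignore link length, router pipeline depth, and contention, all of which a worst-case delay analysis would also need to account for.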
