On-chip network interfaces supporting automatic burst write creation, posted writes and read prefetch

Networks-on-Chip are seen as a scalable solution for facilitating the development of Systems-on-Chip with an increasing number of IP cores. Many studies already address the implementation details of such networks and a large effort has been invested in optimizing the routing strategy and the organization of the network, however by comparison the interface between the network and the IPs has been largely ignored. In this study, we explore optimizations that can be performed at the layer that connects the IPs to the services offered by the NoC. In our FPGA prototype, a MicroBlaze soft-core is connected to a remote memory via the Æthereal NoC. By employing our optimizations to the interface between the MicroBlaze and the NoC, we demonstrate an improvement in terms of speed above 880% in memory intensive tests and of up to 12% in real life applications with little use of communication.

[1]  Kevin Skadron,et al.  Design issues and tradeoffs for write buffers , 1997, Proceedings Third International Symposium on High-Performance Computer Architecture.

[2]  L. Benini,et al.  Xpipes: a network-on-chip architecture for gigascale systems-on-chip , 2004, IEEE Circuits and Systems Magazine.

[3]  Kees G. W. Goossens,et al.  An on-chip interconnect and protocol stack for multiple communication paradigms and programming models , 2009, CODES+ISSS '09.

[4]  Mahmut T. Kandemir,et al.  A compiler-directed data prefetching scheme for chip multiprocessors , 2009, PPoPP '09.

[5]  Om Prakash Gangwal,et al.  An efficient on-chip NI offering guaranteed services, shared-memory abstraction, and flexible network configuration , 2005 .

[6]  Jean-Loup Baer,et al.  A performance study of software and hardware data prefetching schemes , 1994, ISCA '94.

[7]  Kees Goossens,et al.  A Network-on-Chip monitoring infrastructure for communication-centric debug of embedded multi-processor SoCs , 2009, 2009 International Symposium on VLSI Design, Automation and Test.

[8]  Radu Marculescu,et al.  DyAD - smart routing for networks-on-chip , 2004, Proceedings. 41st Design Automation Conference, 2004..

[9]  William J. Dally,et al.  Route packets, not wires: on-chip inteconnection networks , 2001, DAC '01.

[10]  Gurindar S. Sohi 25 Years of the International Symposia on Computer Architecture (Selected Papers) , 1998, ISCA Selected Papers.

[11]  Luca Benini,et al.  NoC Design and Implementation in 65nm Technology , 2007, First International Symposium on Networks-on-Chip (NOCS'07).

[12]  Kees G. W. Goossens,et al.  CoMPSoC: A template for composable and predictable multi-processor system on chips , 2009, TODE.

[13]  Rabi N. Mahapatra,et al.  Interfacing cores with on-chip packet-switched networks , 2003, 16th International Conference on VLSI Design, 2003. Proceedings..

[14]  Anoop Gupta,et al.  Memory consistency and event ordering in scalable shared-memory multiprocessors , 1990, ISCA '90.

[15]  Kees G. W. Goossens,et al.  The aethereal network on chip after ten years: Goals, evolution, lessons, and future , 2010, Design Automation Conference.

[16]  Shilpa Bhoj,et al.  Generic Network Interfaces for Plug and Play NoC Based Architecture , 2006, ARC.

[17]  Leslie Lamport,et al.  How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs , 2016, IEEE Transactions on Computers.

[18]  Kees Goossens,et al.  AEthereal network on chip: concepts, architectures, and implementations , 2005, IEEE Design & Test of Computers.

[19]  Krishnan Srinivasan,et al.  An automated technique for topology and route generation of application specific on-chip interconnection networks , 2005, ICCAD-2005. IEEE/ACM International Conference on Computer-Aided Design, 2005..

[20]  James R. Goodman,et al.  Cache Consistency and Sequential Consistency , 1991 .

[21]  Ran Ginosar,et al.  QNoC: QoS architecture and design process for network on chip , 2004, J. Syst. Archit..

[22]  Sarita V. Adve,et al.  Shared Memory Consistency Models: A Tutorial , 1996, Computer.

[23]  Michel Dubois,et al.  Memory access buffering in multiprocessors , 1998, ISCA '98.

[24]  J.D. Day,et al.  The OSI reference model , 1983 .

[25]  Evangelos P. Markatos,et al.  Shared memory vs. message passing in shared-memory multiprocessors , 1992, [1992] Proceedings of the Fourth IEEE Symposium on Parallel and Distributed Processing.

[26]  Alf Johansson,et al.  On Connecting Cores to Packet Switched On-Chip Networks: A Case Study with MicroBlaze Processor Cores , 2004 .

[27]  Pedro López,et al.  Region-Based Routing: An Efficient Routing Mechanism to Tackle Unreliable Hardware in Network on Chips , 2007, First International Symposium on Networks-on-Chip (NOCS'07).

[28]  F. H. Mcmahon,et al.  The Livermore Fortran Kernels: A Computer Test of the Numerical Performance Range , 1986 .

[29]  Alberto L. Sangiovanni-Vincentelli,et al.  Addressing the system-on-a-chip interconnect woes through communication-based design , 2001, Proceedings of the 38th Design Automation Conference (IEEE Cat. No.01CH37232).

[30]  Radu Marculescu,et al.  Energy- and performance-driven NoC communication architecture synthesis using a decomposition approach , 2005, Design, Automation and Test in Europe.

[31]  Mark D. Hill,et al.  Multiprocessors Should Support Simple Memory-Consistency Models , 1998, Computer.

[32]  Kees G. W. Goossens,et al.  An efficient on-chip network interface offering guaranteed services, shared-memory abstraction, and flexible network configuration , 2004, Proceedings Design, Automation and Test in Europe Conference and Exhibition.