A resource-efficient network interface supporting low latency reconfiguration of virtual circuits in time-division multiplexing networks-on-chip

Abstract: This paper presents a resource-efficient network interface for a time-division multiplexing network-on-chip, intended for use in a multicore platform for hard real-time systems. The network-on-chip provides virtual circuits to move data between core-local on-chip memories. In such a platform, a change of the application's operating mode may require reconfiguration of the virtual circuits that are set up by the network-on-chip. A unique feature of our network interface is the instantaneous reconfiguration between different time-division multiplexing schedules, each containing a set of virtual circuits, without affecting virtual circuits that persist across the reconfiguration. The results show that the worst-case latency from triggering a reconfiguration until the new schedule is executing is in the range of 300 clock cycles. Experiments show that new schedules can be transmitted from a single master to all slave nodes of a 16-core platform in between 500 and 3500 clock cycles. The results also show that the hardware cost of an FPGA implementation of our architecture is considerably smaller than that of other networks-on-chip with similar reconfiguration functionality, and that the worst-case reconfiguration time is smaller than that seen in functionally equivalent architectures.
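The idea of switching between time-division multiplexing schedules while leaving persistent virtual circuits untouched can be illustrated with a small software model. The sketch below is not the paper's hardware design: the names `NetworkInterface`, `request_reconfiguration`, `tick`, and `persistent_circuits` are hypothetical, a virtual circuit is reduced to a (source, destination) pair, and the model assumes that a pending schedule takes effect at the next schedule-period boundary and that a persistent circuit is simply one that appears in both schedules.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple

# Assumption: a virtual circuit is modelled as a (source, destination) pair.
Circuit = Tuple[int, int]
# A schedule is a list of slots; None marks an unused slot.
Schedule = List[Optional[Circuit]]


def persistent_circuits(old: Schedule, new: Schedule) -> Set[Circuit]:
    """Circuits present in both schedules (assumed meaning of 'persist')."""
    return {c for c in old if c is not None} & {c for c in new if c is not None}


@dataclass
class NetworkInterface:
    """Toy model of a network interface holding several TDM schedules."""
    schedules: Dict[str, Schedule]
    active: str
    pending: Optional[str] = None
    slot: int = 0

    def request_reconfiguration(self, name: str) -> None:
        """Mark a stored schedule as the one to run next."""
        if name not in self.schedules:
            raise KeyError(name)
        self.pending = name

    def tick(self) -> Optional[Circuit]:
        """Advance one slot and return the circuit served in it."""
        # Assumption: the switch happens only at a period boundary, so the
        # old schedule always completes its current period first.
        if self.slot == 0 and self.pending is not None:
            self.active = self.pending
            self.pending = None
        table = self.schedules[self.active]
        circuit = table[self.slot]
        self.slot = (self.slot + 1) % len(table)
        return circuit


schedules = {
    "A": [(0, 1), (1, 2), (0, 1), None],
    "B": [(0, 1), (2, 3), (0, 1)],
}
ni = NetworkInterface(schedules=schedules, active="A")
trace = [ni.tick(), ni.tick()]           # two slots of schedule A
ni.request_reconfiguration("B")          # mode change requested mid-period
trace += [ni.tick() for _ in range(5)]   # finish A's period, then run B
```

In this toy run, circuit (0, 1) appears in both schedules and keeps being served before and after the switch, while (1, 2) is dropped and (2, 3) is added. The model says nothing about the cycle counts reported in the abstract.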
