Coupling TDM NoC and DRAM controller for cost and performance optimization of real-time systems

Existing memory subsystems and TDM NoCs for real-time systems are optimized independently in terms of cost and performance by configuring their arbiters according to the bandwidth and/or latency requirements of their clients. However, when they are used together and run in different clock domains, i.e. when they are decoupled, there is no structured methodology for selecting the NoC interface width and operating frequency that minimize area and/or power consumption. Moreover, the two arbitration points, one in the NoC and the other in the memory subsystem, add overhead to the guaranteed worst-case latency. This makes it hard to design cost-efficient real-time systems. The three main contributions of this paper are: (1) We present a novel methodology to couple any existing TDM NoC with a real-time memory controller and to compute the NoC interface width and operating frequency combinations that give minimal area and/or power consumption. (2) For two different TDM NoC types, one packet-switched and the other circuit-switched, we show the trade-off between area and power consumption across the different NoC configurations, for different DRAM generations. (3) We compare the coupled and decoupled architectures with the two NoCs in terms of guaranteed worst-case latency, area and power consumption by synthesizing the designs in 40 nm technology. Our experiments show that, in a system consisting of 16 clients, a coupled architecture saves over 44% in guaranteed latency, 18% and 17% in area, and 19% and 11% in power consumption for a packet-switched and a circuit-switched TDM NoC, respectively, with different DRAM types.
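To make the width/frequency selection in contribution (1) concrete, the following is a minimal sketch and not the paper's methodology. It assumes the only constraint is that the raw NoC link bandwidth equals the DRAM peak bandwidth, and it ignores memory efficiency, TDM slot sizes and clock-domain details that a real design flow would have to account for. The function name `matching_noc_configs`, its parameters and the example values are hypothetical.

```python
from fractions import Fraction


def matching_noc_configs(dram_width_bits, dram_mega_transfers_per_s, candidate_widths_bits):
    """List (noc_width_bits, noc_freq_mhz) pairs whose raw bandwidth
    equals the DRAM peak bandwidth.

    Assumption (not from the paper): bandwidth matching is the only
    constraint; one NoC word is transferred per NoC clock cycle.
    """
    # Peak DRAM bandwidth in Mbit/s: data bus width times transfer rate.
    peak_mbit_per_s = dram_width_bits * dram_mega_transfers_per_s
    configs = []
    for width in candidate_widths_bits:
        # Frequency (MHz) at which a link of this width carries the same bandwidth.
        freq_mhz = Fraction(peak_mbit_per_s, width)
        configs.append((width, freq_mhz))
    return configs


# Illustrative values: a 16-bit memory at 800 MT/s and three candidate NoC widths.
for width, freq in matching_noc_configs(16, 800, [32, 64, 128]):
    print(f"{width}-bit NoC interface at {float(freq):.1f} MHz")
```

Each resulting pair would then be a candidate for the area and power comparison the abstract describes; wider interfaces trade a lower clock frequency against more wires and buffering.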
