A Real-Time Multichannel Memory Controller and Optimal Mapping of Memory Clients to Memory Channels

Ever-increasing demands for main memory bandwidth and a better memory speed/power tradeoff have led to the introduction of memories with multiple memory channels, such as Wide IO DRAM. Efficient utilization of a multichannel memory as a shared resource in multiprocessor real-time systems depends on mapping the memory clients to the memory channels according to their requirements on latency, bandwidth, communication, and memory capacity. However, there is currently no real-time memory controller for multichannel memories, and there is no methodology to optimally configure multichannel memories in real-time systems. As a first step in this direction, we present two main contributions in this article: (1) a configurable real-time multichannel memory controller architecture with a novel method for logical-to-physical address translation and (2) two design-time methods to map memory clients to the memory channels, one an optimal algorithm based on an integer programming formulation of the mapping problem, and the other a fast heuristic algorithm. We demonstrate the real-time guarantees on bandwidth and latency provided by our multichannel memory controller architecture by experimental evaluation. Furthermore, we compare the integer programming formulation, run in a solver, and the heuristic algorithm against two existing mapping algorithms in terms of computation time and mapping success ratio. We show that, for realistically sized problems, the solver finds an optimal solution in 2 hours, while the heuristic produces a mapping in less than 1 second with less than 7% mapping failure. Finally, we demonstrate the configuration of a Wide IO DRAM in a high-definition (HD) video and graphics processing system to emphasize the practical applicability and effectiveness of this work.
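To make the mapping problem concrete, the following is a minimal sketch of a generic first-fit-decreasing assignment of clients to channels. It is not the heuristic proposed in the article: it considers only bandwidth and capacity, ignores the latency and communication requirements, and assumes each client is placed on exactly one channel. The names `Channel` and `first_fit_decreasing`, and all numbers in the usage example, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Channel:
    # Remaining resources of one memory channel (illustrative units).
    bandwidth: int  # MB/s still unallocated
    capacity: int   # MB still unallocated
    clients: list = field(default_factory=list)


def first_fit_decreasing(clients, channels):
    """Map each client to the first channel that can still hold it.

    clients:  dict of name -> (required bandwidth, required capacity)
    channels: list of Channel objects, updated in place
    Returns a dict of name -> channel index, or None if some client
    cannot be placed (a mapping failure).
    """
    mapping = {}
    # Place the most bandwidth-hungry clients first.
    ordered = sorted(clients.items(), key=lambda kv: kv[1][0], reverse=True)
    for name, (bw, cap) in ordered:
        for idx, ch in enumerate(channels):
            if ch.bandwidth >= bw and ch.capacity >= cap:
                ch.bandwidth -= bw
                ch.capacity -= cap
                ch.clients.append(name)
                mapping[name] = idx
                break
        else:
            # No channel had enough bandwidth and capacity left.
            return None
    return mapping


if __name__ == "__main__":
    demo_clients = {"video": (600, 64), "gpu": (500, 128), "audio": (100, 8)}
    demo_channels = [Channel(800, 128), Channel(800, 128)]
    print(first_fit_decreasing(demo_clients, demo_channels))
```

A greedy pass like this is fast but can fail on inputs for which a valid mapping exists, which is the kind of tradeoff between computation time and mapping success ratio that the comparison above measures.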
