Dataflow models for shared memory access latency analysis

Performance analysis of applications on multi-core platforms is challenging because of temporal interference when tasks access shared resources. In particular, memory arbiters introduce a non-constant delay that significantly influences the execution time of a task. In this paper, we select a priority-based budget scheduler as the memory arbiter; it bounds temporal interference by construction and is well suited for bursty service provision. Existing performance analysis approaches assume a constant memory access latency, which leads to high overestimation. We instead propose a conservative dataflow model for this scheduler that takes the history of memory accesses into account. In a case study with an MP3 decoder on an ARM7 processor, we show that assuming a constant memory access latency for the selected scheduler results in an overestimation of three orders of magnitude. Compared to simulation, the proposed dataflow model overestimates by less than 3%, whereas in previous work the overestimation was up to 104%. Furthermore, the proposed approach improves performance by about 20% compared to a time-division-multiplex scheduler.
