Process-variation-aware mapping of best-effort and real-time streaming applications to MPSoCs

As technology scales, the impact of process variation on the maximum supported frequency (FMAX) of individual cores in a multiprocessor system-on-chip (MPSoC) becomes more pronounced. Task allocation without variation-aware performance analysis can greatly compromise performance and lead to a significant loss in yield, defined as the percentage of manufactured chips satisfying the application timing requirement. We propose variation-aware task allocation for best-effort and real-time streaming applications modeled as task graphs. Our solutions are primarily based on the throughput requirement, which is the most important timing requirement in many real-time streaming applications. The four main contributions of this work are (1) distinguishing best-effort firm real-time and soft real-time application classes, which require different optimization criteria, (2) using dataflow graphs, which are well suited for modeling and analysis of streaming applications, we explicitly model task execution both in terms of clock cycles (which is independent of variation) and seconds (which does depend on the variation of the resource), which we connect by an explicit binding, (3) we present two optimization approaches, which give different improvement results at different costs, (4) we present both exhaustive and heuristic algorithms that implement the optimization approaches. Our variation-aware mapping algorithms are tested on models of seven real applications and are compared to mapping methods that are unaware of hardware variation. Our results demonstrate (1) improvements in the average performance (3% on average) for best-effort applications, and (2) for firm real-time and soft real-time applications, yield improvements of up to 27% with an average of 15%, showing the effectiveness of our approaches.

[1]  Orlando Moreira,et al.  Self-Timed Scheduling Analysis for Real-Time Applications , 2007, EURASIP J. Adv. Signal Process..

[2]  Saurabh Dighe,et al.  Within-Die Variation-Aware Dynamic-Voltage-Frequency-Scaling With Optimal Core Allocation and Thread Hopping for the 80-Core TeraFLOPS Processor , 2011, IEEE Journal of Solid-State Circuits.

[3]  Narayanan Vijaykrishnan,et al.  Variation-aware task allocation and scheduling for MPSoC , 2007, 2007 IEEE/ACM International Conference on Computer-Aided Design.

[4]  Ladislau Bölöni,et al.  A Comparison of Eleven Static Heuristics for Mapping a Class of Independent Tasks onto Heterogeneous Distributed Computing Systems , 2001, J. Parallel Distributed Comput..

[5]  Michael I. Gordon,et al.  Language and Compiler Design for Streaming Applications , 2004, 18th International Parallel and Distributed Processing Symposium, 2004. Proceedings..

[6]  Luca Benini,et al.  An efficient and complete approach for throughput-maximal SDF allocation and scheduling on multi-core platforms , 2010, 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010).

[7]  Taewhan Kim,et al.  Timing variation-aware task scheduling and binding for MPSoC , 2009, 2009 Asia and South Pacific Design Automation Conference.

[8]  Qiang Xu,et al.  Performance yield-driven task allocation and scheduling for MPSoCs under process variation , 2010, Design Automation Conference.

[9]  Luca Benini,et al.  Throughput Constraint for Synchronous Data Flow Graphs , 2009, CPAIOR.

[10]  Jakob Engblom,et al.  The worst-case execution-time problem—overview of methods and survey of tools , 2008, TECS.

[11]  Sander Stuijk,et al.  Throughput Analysis of Synchronous Data Flow Graphs , 2006, Sixth International Conference on Application of Concurrency to System Design (ACSD'06).

[12]  Sander Stuijk,et al.  CA-MPSoC: An automated design flow for predictable multi-processor architectures for multiple applications , 2010, J. Syst. Archit..

[13]  Sander Stuijk,et al.  Multiprocessor Resource Allocation for Throughput-Constrained Synchronous Dataflow Graphs , 2007, 2007 44th ACM/IEEE Design Automation Conference.

[14]  Kees G. W. Goossens,et al.  Process-variation aware mapping of real-time streaming applications to MPSoCs for improved yield , 2012, Thirteenth International Symposium on Quality Electronic Design (ISQED).

[15]  Sander Stuijk,et al.  SDF^3: SDF For Free , 2006, Sixth International Conference on Application of Concurrency to System Design (ACSD'06).

[16]  Doris Schmitt-Landsiedel,et al.  The impact of intra-die device parameter variations on path delays and on the design for yield of low voltage digital circuits , 1996, ISLPED '96.

[17]  Shuvra S. Bhattacharyya,et al.  Embedded Multiprocessors: Scheduling and Synchronization , 2000 .

[18]  Sander Stuijk,et al.  Exploring trade-offs in buffer requirements and throughput constraints for synchronous dataflow graphs , 2006, 2006 43rd ACM/IEEE Design Automation Conference.

[19]  Vivek De,et al.  Adaptive body bias for reducing impacts of die-to-die and within-die parameter variations on microprocessor frequency and leakage , 2002, 2002 IEEE International Solid-State Circuits Conference. Digest of Technical Papers (Cat. No.02CH37315).

[20]  corporateName,et al.  Design Automation Conference (DAC) , 2011 .

[21]  Paul Zuber,et al.  Variability aware modeling of SoCs: From device variations to manufactured system yield , 2009, 2009 10th International Symposium on Quality Electronic Design.

[22]  James Tschanz,et al.  Impact of Parameter Variations on Circuits and Microarchitecture , 2006, IEEE Micro.

[23]  Borivoje Nikolic,et al.  Measurement and analysis of variability in 45nm strained-Si CMOS technology , 2008, 2008 IEEE Custom Integrated Circuits Conference.

[24]  Kees G. W. Goossens,et al.  CoMPSoC: A template for composable and predictable multi-processor system on chips , 2009, TODE.

[25]  Elaheh Bozorgzadeh,et al.  Process variation aware system-level task allocation using stochastic ordering of delay distributions , 2008, 2008 IEEE/ACM International Conference on Computer-Aided Design.

[26]  James D. Meindl,et al.  Impact of die-to-die and within-die parameter fluctuations on the maximum clock frequency distribution for gigascale integration , 2002, IEEE J. Solid State Circuits.