WCET Measurement-based and Extreme Value Theory Characterisation of CUDA Kernels

The massive computational power of graphics processing units (GPUs), combined with novel programming models such as CUDA, makes them attractive platforms for many parallel applications. This includes embedded and real-time applications, which, however, also have temporal constraints: computations must not only be correct but also be completed on time. This poses a challenge because the characterisation of the worst-case temporal behaviour of parallel applications on GPUs is still an open problem. To address this situation, this paper proposes a measurement-based and statistical approach for the probabilistic characterisation of the worst-case execution time of such an application.
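As a rough illustration of what a measurement-based, extreme-value characterisation can look like, the sketch below applies the classical block-maxima technique to a list of execution-time measurements and fits a Gumbel distribution by the method of moments. It is not the procedure used in the paper: the function names, the block size, the synthetic timings, and the choice of a Gumbel model with a moment fit are all assumptions made for this example, and the measurements are assumed to be independent and identically distributed.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def block_maxima(samples, block_size):
    """Split samples into consecutive blocks and keep the maximum of each full block."""
    n_blocks = len(samples) // block_size
    return [max(samples[i * block_size:(i + 1) * block_size]) for i in range(n_blocks)]


def fit_gumbel_moments(maxima):
    """Method-of-moments Gumbel fit; returns (location, scale)."""
    mean = statistics.fmean(maxima)
    std = statistics.stdev(maxima)
    scale = std * math.sqrt(6) / math.pi
    loc = mean - EULER_GAMMA * scale
    return loc, scale


def pwcet(loc, scale, exceedance_prob):
    """Value that a Gumbel(loc, scale) variable exceeds with the given probability."""
    # Gumbel quantile at level 1 - p: loc - scale * ln(-ln(1 - p))
    return loc - scale * math.log(-math.log1p(-exceedance_prob))


if __name__ == "__main__":
    # Hypothetical data: synthetic timings standing in for measured kernel runs.
    random.seed(0)
    timings_us = [1000.0 + random.expovariate(1 / 20) for _ in range(5000)]

    maxima = block_maxima(timings_us, block_size=50)
    loc, scale = fit_gumbel_moments(maxima)
    for p in (1e-3, 1e-6, 1e-9):
        print(f"exceedance {p:g} per block: {pwcet(loc, scale, p):.1f} us")
```

The exceedance probability here refers to the maximum of one block of runs, not to a single run. With real measurements one would first check the independence and stationarity assumptions and the goodness of the Gumbel fit before trusting any extrapolated bound.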
