Saving energy without defying deadlines on mobile GPU-based heterogeneous systems

With the advent of low-power programmable compute cores based on GPUs, GPU-equipped heterogeneous platforms are becoming common across a wide spectrum of industries, including safety-critical domains such as automotive. While the suitability of GPUs for throughput-oriented applications is well accepted, their applicability to real-time applications remains an open issue. Moreover, energy-efficient computing is a major concern in mobile and embedded systems, yet there has been no systematic study of the energy savings that GPUs may provide. In this paper, we propose an approach that uses the GPU and the CPU together, in a heterogeneous fashion, to meet the deadlines of a real-time application while maximizing energy savings. GPUs are inherently built to maximize throughput, which poses a major challenge when deadlines must be satisfied. The problem becomes more acute because GPUs are more energy efficient than CPUs: a naive approach that simply maximizes GPU utilization can easily lead to solutions that are infeasible from a deadline perspective.
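The tension between energy and deadlines can be illustrated with a toy example. The sketch below is not the method proposed in the paper; it is a brute-force search under simplifying assumptions: tasks run one after another as a chain, each task runs entirely on either the CPU or the GPU, and data-transfer costs and CPU/GPU overlap are ignored. The task costs, the names `TASKS`, `evaluate`, and `best_mapping`, and the 14 ms deadline are all hypothetical.

```python
from itertools import product

# Hypothetical per-task costs: (cpu_ms, cpu_mJ, gpu_ms, gpu_mJ).
# The numbers are illustrative only; here the GPU always uses less
# energy but is slower on the first two tasks.
TASKS = [
    (4.0, 8.0, 6.0, 3.0),
    (5.0, 10.0, 9.0, 4.0),
    (3.0, 6.0, 2.0, 1.5),
]


def evaluate(mapping, tasks):
    """Return (total_time_ms, total_energy_mJ) for a sequential task chain."""
    total_time = 0.0
    total_energy = 0.0
    for device, (cpu_t, cpu_e, gpu_t, gpu_e) in zip(mapping, tasks):
        if device == "gpu":
            total_time += gpu_t
            total_energy += gpu_e
        else:
            total_time += cpu_t
            total_energy += cpu_e
    return total_time, total_energy


def best_mapping(tasks, deadline_ms):
    """Exhaustively pick the lowest-energy mapping that meets the deadline.

    Returns (mapping, time_ms, energy_mJ), or None if no mapping is feasible.
    Exponential in the number of tasks, so only suitable for tiny examples.
    """
    best = None
    for mapping in product(("cpu", "gpu"), repeat=len(tasks)):
        time_ms, energy = evaluate(mapping, tasks)
        if time_ms <= deadline_ms and (best is None or energy < best[2]):
            best = (mapping, time_ms, energy)
    return best


if __name__ == "__main__":
    deadline = 14.0  # hypothetical end-to-end deadline in ms
    print("all-GPU:", evaluate(("gpu",) * len(TASKS), TASKS))
    print("best feasible:", best_mapping(TASKS, deadline))
```

With these made-up numbers, the all-GPU mapping has the lowest energy (8.5 mJ) but takes 17 ms and misses the deadline, whereas the cheapest feasible mapping moves the second task to the CPU and finishes in 13 ms at 14.5 mJ.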
