Don't race the memory bus: taming the GC leadfoot

Dynamic voltage and frequency scaling (DVFS) is ubiquitous on mobile devices as a mechanism for saving energy. Reducing the clock frequency of a processor allows a corresponding reduction in power consumption, as does turning off idle cores. Garbage collection is a canonical example of the sort of memory-bound workload that responds best to such scaling. Here, we explore the impact of frequency scaling for garbage collection on a real mobile device running Android's Dalvik virtual machine, which uses a concurrent collector. By controlling the frequency of the core on which the concurrent collector thread runs, we can reduce power significantly. Running established multi-threaded benchmarks shows that total processor energy can be reduced by up to 30%, with an end-to-end performance loss of at most 10%.
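To make the energy and performance figures concrete, the following is a minimal sketch of the arithmetic, not the measurement method used in this work. It assumes the textbook CMOS model in which dynamic power scales with V^2 * f, and it assumes, purely for illustration, that the 30% energy reduction and the 10% slowdown occur on the same run (the abstract gives each only as an upper bound). The function names are hypothetical.

```python
def average_power_ratio(energy_ratio: float, time_ratio: float) -> float:
    """Scaled/baseline average power, from E = P_avg * t."""
    if time_ratio <= 0:
        raise ValueError("time_ratio must be positive")
    return energy_ratio / time_ratio


def dynamic_power_ratio(freq_ratio: float, volt_ratio: float) -> float:
    """Scaled/baseline dynamic power under the assumed P ~ C * V^2 * f model."""
    return volt_ratio ** 2 * freq_ratio


# Illustrative assumption: 30% less energy and 10% longer run time together.
energy_ratio = 0.70
time_ratio = 1.10
print(f"average power ratio: {average_power_ratio(energy_ratio, time_ratio):.3f}")
# prints "average power ratio: 0.636"
```

Under these assumptions, average processor power over the run would fall to roughly 64% of the baseline, which shows that the energy saving outweighs the longer run time. The same helpers can be used to explore other operating points, for example `dynamic_power_ratio(0.5, 0.8)` for a core at half frequency and 80% voltage.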
