Workload prediction for adaptive power scaling using deep learning

We apply hierarchical sparse coding, a form of deep learning, to model user-driven workloads based on on-chip hardware performance counters. We then predict periods of low instruction throughput, during which frequency and voltage can be scaled to reclaim power. Using a multi-layer coding structure, our method progressively codes counter values in terms of a few prominent features learned from data, and passes them to a Support Vector Machine (SVM) classifier where they act as signatures for predicting future workload states. We show that prediction accuracy and look-ahead range improve significantly over linear regression modeling, giving more time to adjust power management settings. Our method relies on learning and feature extraction algorithms that can discover and exploit hidden statistical invariances specific to workloads. We argue that, in addition to achieving superior prediction performance, our method is fast enough for practical use. To our knowledge, we are the first to use deep learning at the instruction level for workload prediction and on-chip power adaptation.
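To make the pipeline described above concrete, the following is a minimal single-layer sketch, not the authors' implementation. It assumes plain matching pursuit as the sparse coder, a small hand-picked dictionary in place of features learned from data, and a hypothetical IPC threshold for labeling low-throughput periods. The function names (`matching_pursuit`, `label_low_throughput`) and all numeric values are illustrative only. In the full method, the sparse codes would be stacked across several layers and passed to an SVM classifier, which is omitted here.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def matching_pursuit(signal, dictionary, k):
    """Greedily code `signal` with at most k unit-norm atoms.

    Returns one coefficient per atom; most stay zero (a sparse code).
    Assumption: plain matching pursuit stands in for the paper's coder.
    """
    residual = list(signal)
    coeffs = [0.0] * len(dictionary)
    for _ in range(k):
        scores = [dot(residual, atom) for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda j: abs(scores[j]))
        if abs(scores[best]) < 1e-12:
            break  # residual is already explained by the chosen atoms
        coeffs[best] += scores[best]
        residual = [r - scores[best] * a
                    for r, a in zip(residual, dictionary[best])]
    return coeffs


def label_low_throughput(ipc_trace, t, lookahead, threshold):
    """Return +1 if mean IPC over [t, t + lookahead) is below threshold, else -1.

    Assumption: `threshold` is a hypothetical cutoff for "low throughput".
    """
    future = ipc_trace[t:t + lookahead]
    return 1 if sum(future) / len(future) < threshold else -1


# Hypothetical dictionary: three orthonormal atoms over a 4-sample counter window.
atoms = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, 0.5, -0.5, -0.5],
    [0.5, -0.5, 0.5, -0.5],
]

# A window of counter values built as 2 * atoms[0] + 1 * atoms[2].
window = [1.5, 0.5, 1.5, 0.5]
code = matching_pursuit(window, atoms, k=2)

# Training label for this window: is the upcoming IPC low?
ipc_trace = [1.0, 1.0, 0.2, 0.3, 0.1]
label = label_low_throughput(ipc_trace, t=2, lookahead=3, threshold=0.5)
```

Each `(code, label)` pair would serve as one training example for the classifier. A longer `lookahead` corresponds to predicting further into the future, which is the look-ahead range the abstract refers to.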
