An input-centric paradigm for program dynamic optimizations

Accurately predicting program behaviors (e.g., locality, dependency, method calling frequency) is fundamental to program optimization and runtime adaptation. Despite decades of remarkable progress, prior studies have not systematically exploited program inputs, a deciding factor in program behavior. Motivated by the strong, predictive correlations between program inputs and behaviors that recent studies have uncovered, this work proposes bringing program inputs into the focus of program behavior analysis, cultivating a new paradigm named input-centric program behavior analysis. The new approach consists of three components that form a three-layer pyramid. At the base is program input characterization, a component that resolves the complexity of raw program inputs and extracts their important features. In the middle is input-behavior modeling, a component that recognizes and models the correlations between the characterized input features and program behaviors. These two components constitute input-centric program behavior analysis, which (ideally) can predict the large-scope behaviors of a program's execution as soon as the execution starts. The top layer of the pyramid is input-centric adaptation, which capitalizes on the opportunities that the first two components create to enable proactive adaptation for program optimization. By centering on program inputs, the new approach resolves a proactivity-adaptivity dilemma inherent in previous techniques. Its benefits are demonstrated through proactive dynamic optimization and version selection, which yield significant performance improvements on a set of Java and C programs.
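
To make the three-layer pyramid concrete, the following is a minimal Python sketch, not the authors' implementation. Everything in it is an assumption made for illustration: the single input feature (token count), the least-squares line used as the input-behavior model, the function names `characterize`, `fit_line`, `predict`, and `choose_version`, the threshold, and the training data, which is invented.

```python
# Illustrative sketch only: all names, the feature, the model, and the data
# below are hypothetical and are not taken from the paper.

def characterize(raw_input: str) -> float:
    """Layer 1 (input characterization): reduce a raw input to one
    numeric feature. Here the feature is simply the token count."""
    return float(len(raw_input.split()))


def fit_line(xs, ys):
    """Layer 2 (input-behavior modeling): ordinary least squares for
    behavior = a * feature + b, returned as the pair (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        # All training features are identical: fall back to the mean behavior.
        return 0.0, mean_y
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    return a, mean_y - a * mean_x


def predict(model, feature: float) -> float:
    """Predict the behavior (e.g., calls to a hot method) from a feature."""
    a, b = model
    return a * feature + b


def choose_version(predicted_calls: float, threshold: float = 1000.0) -> str:
    """Layer 3 (input-centric adaptation): select a code version at the
    start of the run, before any runtime profile has been collected."""
    return "optimized" if predicted_calls >= threshold else "baseline"


# Invented training runs: (raw input, observed call count of a hot method).
training = [("a " * 10, 100.0), ("a " * 50, 500.0), ("a " * 200, 2000.0)]
features = [characterize(raw) for raw, _ in training]
behaviors = [calls for _, calls in training]
model = fit_line(features, behaviors)

# A new run: the decision is made from the input alone, at startup.
new_input = "a " * 150
decision = choose_version(predict(model, characterize(new_input)))
print(decision)
```

The point of the sketch is the ordering of the steps: the version is chosen from the input before execution begins, which is the proactive behavior the abstract contrasts with adaptation that reacts to profiles gathered during the run. A realistic system would need richer input features and a more capable model than a single fitted line.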
