Enabling technologies for self-aware adaptive systems

Self-aware computer systems will be able to adapt their behavior and resources thousands of times per second, automatically finding the best way to accomplish a given goal despite changing environmental conditions and demands. This capability benefits a broad spectrum of computer systems, from embedded systems to supercomputers, and is particularly useful for meeting power, performance, and resource-metering challenges in mobile computing, cloud computing, multicore computing, adaptive and dynamic compilation environments, and parallel operating systems. Some of the challenges in implementing self-aware systems are (a) knowing, within the system, what the goals of applications are and whether those goals are being met; (b) deciding what actions to take to help applications meet their goals; and (c) developing standard techniques that generalize to a broad range of self-aware systems. This work presents our vision for self-aware adaptive systems and proposes enabling technologies to address these three challenges. We describe a framework called Application Heartbeats that provides a general, standardized way for applications to monitor their performance and make that information available to external observers. Then, through a study of a self-optimizing synchronization library called Smartlocks, we demonstrate a powerful technique that systems can use to determine which optimization actions to take. We show that Heartbeats can be applied naturally as a reward signal in reinforcement learning optimization strategies and that, using such a strategy, Smartlocks significantly improve the performance of applications on asymmetric multicores, an important emerging class of multicore systems.
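
As a rough illustration of using a heartbeat rate as a reward signal, the sketch below pairs a toy progress monitor with a simple learner. It is a sketch under stated assumptions, not the Application Heartbeats API or the Smartlocks algorithm: the class names, the policy names, and the simulated latencies are all hypothetical, and an epsilon-greedy bandit stands in for the reinforcement learning strategy the abstract refers to.

```python
import random
from collections import deque


class HeartbeatMonitor:
    """Hypothetical stand-in for a heartbeat-style progress monitor."""

    def __init__(self, window=4):
        # Keep only the timestamps of the most recent `window` heartbeats.
        self.timestamps = deque(maxlen=window)

    def heartbeat(self, now):
        """Record one unit of application progress at time `now` (seconds)."""
        self.timestamps.append(now)

    def rate(self):
        """Return heartbeats per second over the window, or 0.0 if unknown."""
        if len(self.timestamps) < 2:
            return 0.0
        elapsed = self.timestamps[-1] - self.timestamps[0]
        if elapsed <= 0:
            return 0.0
        return (len(self.timestamps) - 1) / elapsed


class EpsilonGreedyChooser:
    """Bandit-style learner (an assumption, not the Smartlocks learner)."""

    def __init__(self, actions, epsilon=0.1, rng=None):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = rng or random.Random(0)
        self.counts = {a: 0 for a in self.actions}
        self.values = {a: 0.0 for a in self.actions}

    def choose(self):
        # Try every action once, then mostly exploit the best estimate.
        untried = [a for a in self.actions if self.counts[a] == 0]
        if untried:
            return untried[0]
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.values[a])

    def update(self, action, reward):
        # Incremental running mean of the reward observed for this action.
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]


def simulate(steps=50):
    # Hypothetical per-work-item latency (seconds) under each policy.
    latency = {"favor_fast_core": 0.01, "favor_slow_core": 0.04}
    chooser = EpsilonGreedyChooser(latency, epsilon=0.1, rng=random.Random(0))
    clock = 0.0  # simulated time, so the example is deterministic
    for _ in range(steps):
        action = chooser.choose()
        monitor = HeartbeatMonitor(window=4)
        monitor.heartbeat(clock)
        for _ in range(4):
            clock += latency[action]
            monitor.heartbeat(clock)
        # The observed heartbeat rate is the reward for the chosen policy.
        chooser.update(action, monitor.rate())
    return chooser


if __name__ == "__main__":
    learned = simulate()
    print(learned.values)
```

The learner never sees the latencies directly. It only observes the heartbeat rate the application reports, which is what lets the same optimizer be reused across applications that emit heartbeats.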
