Continuous Monitoring of Software Services: Design and Application of the Kieker Framework

In addition to studying the construction and evolution of software services, the software engineering discipline needs to address the operation of continuously running software services. Robust operation requires means for effectively monitoring software runtime behavior. In contrast to profiling for construction activities, monitoring of operational services should impose only a small performance overhead. Furthermore, instrumentation should be non-intrusive to the business logic as far as possible. We present the Kieker framework for monitoring software runtime behavior, e.g., internal performance or (distributed) trace data. Its flexible architecture allows framework components to be replaced or added, including monitoring probes, analysis components, and the monitoring record types shared by logging and analysis. As a non-intrusive instrumentation technique, Kieker currently employs, but is not restricted to, aspect-oriented programming. An extensive lab study evaluates and quantifies the low overhead caused by the framework components. Qualitative evaluations from industrial case studies demonstrate the practicality of the approach with a telecommunication customer self-service and a digital photo submission service. Kieker is available as open-source software, to which both academic and industrial partners contribute code. Our experiment data is publicly available, allowing interested researchers to repeat and extend our lab experiments.
