A Survey of Software Log Instrumentation

Log messages are widely used in software systems for a variety of purposes during both development and field operation. Software logging has two phases: log instrumentation and log management. Log instrumentation is the practice in which developers insert logging code into source code to record runtime information. Log management is the practice in which operators collect the generated log messages and apply data analysis techniques to gain insight into runtime behavior. Many open source and commercial log management tools are available. However, their effectiveness depends heavily on the quality of the instrumented logging code, because log messages generated by high-quality logging code greatly ease log analysis tasks (e.g., monitoring, failure diagnosis, and auditing). Hence, in this article, we conducted a systematic survey of state-of-the-art research on log instrumentation by studying 69 papers published between 1997 and 2019. In particular, we focus on the challenges and proposed solutions in the three steps of log instrumentation: (1) logging approach; (2) logging utility integration; and (3) logging code composition. This survey will be useful to DevOps practitioners and researchers who are interested in software logging.
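As a rough illustration of what log instrumentation looks like in practice, the following minimal sketch uses Python's standard `logging` module. The mapping of code lines to the three steps named above is our own reading, not a definition taken from the survey, and the logger name `checkout`, the function `charge`, and the message texts are hypothetical.

```python
import io
import logging

# Step 1 (assumed reading of "logging approach"): conventional logging,
# i.e. logging statements written by hand into the source code.

# Step 2 (assumed reading of "logging utility integration"): pick a
# logging utility and configure it. Here: the standard library logger,
# writing to an in-memory stream so the output can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
logger = logging.getLogger("checkout")  # hypothetical component name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep output out of the root logger


def charge(order_id, amount):
    """Hypothetical business function instrumented with logging code."""
    # Step 3 (assumed reading of "logging code composition"): decide
    # where to log, at which level, with what text and which variables.
    logger.info("charging order %s amount=%.2f", order_id, amount)
    if amount <= 0:
        logger.error("rejected order %s: non-positive amount %.2f", order_id, amount)
        return False
    return True


charge("A17", 19.99)
charge("B42", 0)
log_output = stream.getvalue()
print(log_output, end="")
```

The generated messages are the raw material that log management tools later consume, which is why the choice of location, level, and recorded variables matters for downstream analysis.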
