Automated Evolution of Feature Logging Statement Levels Using Git Histories and Degree of Interest

Logging -- used for system events and security breaches to describe more informational yet essential aspects of software features -- is pervasive. Given the high transactionality of today's software, logging effectiveness can be reduced by information overload. Log levels help alleviate this problem by associating a priority with each log so that logs can later be filtered. As software evolves, however, the levels of logs documenting surrounding feature implementations may also require modification, as features once deemed important may have decreased in urgency and vice versa. We present an automated approach that assists developers in evolving the levels of such (feature) logs. The approach, based on mining Git histories and manipulating a degree of interest (DOI) model, transforms source code to revitalize feature log levels based on the "interestingness" of the surrounding code. Built upon JGit and Mylyn, the approach is implemented as an Eclipse IDE plug-in and evaluated on 18 Java projects with $\sim$3 million lines of code and $\sim$4K log statements. Our tool successfully analyzes 99.22% of logging statements, increases log level distributions by $\sim$20%, and increases the focus of logs in bug fix contexts $\sim$83% of the time. Moreover, pull (patch) requests were integrated into large and popular open-source projects. The results indicate that the approach is promising in assisting developers in evolving feature log levels.
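
To make the idea concrete, the following is a minimal, hypothetical Java sketch of how edit history might drive a DOI value that is then mapped to a log level. It is not the implementation described above, and it uses neither JGit nor Mylyn. The class name `DoiLogLevelSketch`, the constants `EDIT_BUMP` and `DECAY`, the equal-width partitioning of DOI values, and the choice of `FINEST` through `INFO` as the candidate levels are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.logging.Level;

/** Hypothetical sketch only; names and constants are illustrative assumptions. */
class DoiLogLevelSketch {
    // Assumed candidate levels for feature logs, from least to most important.
    static final Level[] FEATURE_LEVELS = {
        Level.FINEST, Level.FINER, Level.FINE, Level.CONFIG, Level.INFO
    };

    static final double EDIT_BUMP = 1.0; // assumed interest gained when a method is edited
    static final double DECAY = 0.1;     // assumed interest lost by every known method per commit

    /**
     * Replays commits oldest-first. Each commit is modeled as the list of
     * method names it edited (a stand-in for data mined from a Git history).
     */
    static Map<String, Double> computeDoi(List<List<String>> commits) {
        Map<String, Double> doi = new LinkedHashMap<>();
        for (List<String> editedMethods : commits) {
            // Every method seen so far loses some interest, floored at zero.
            doi.replaceAll((method, value) -> Math.max(0.0, value - DECAY));
            // Methods edited in this commit then gain interest.
            for (String method : editedMethods) {
                doi.merge(method, EDIT_BUMP, Double::sum);
            }
        }
        return doi;
    }

    /** Maps a DOI value to a level by splitting [0, maxDoi] into equal-width ranges. */
    static Level levelFor(double value, double maxDoi) {
        if (maxDoi <= 0.0) {
            return FEATURE_LEVELS[0];
        }
        int index = (int) (value / maxDoi * FEATURE_LEVELS.length);
        return FEATURE_LEVELS[Math.min(index, FEATURE_LEVELS.length - 1)];
    }

    public static void main(String[] args) {
        // Three toy commits; A.foo is edited in all of them, B.bar only once.
        List<List<String>> commits = List.of(
            List.of("A.foo", "B.bar"),
            List.of("A.foo"),
            List.of("A.foo"));
        Map<String, Double> doi = computeDoi(commits);
        double maxDoi = doi.values().stream()
            .mapToDouble(Double::doubleValue).max().orElse(0.0);
        doi.forEach((method, value) ->
            System.out.println(method + " -> " + levelFor(value, maxDoi)));
    }
}
```

In this sketch, a method that is edited often and recently ends up in a higher DOI range, so a log statement inside it would be suggested a higher level than one in a method that has not been touched for many commits.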
