Characterizing and Detecting Duplicate Logging Code Smells

Software logs are widely used by developers to assist in various tasks. Despite the importance of logs, prior studies show that there is no industrial standard on how to write logging statements. Recent research on logs often considers the appropriateness of a log only as an individual item (e.g., a single logging statement), whereas logs are typically analyzed in tandem. In this paper, we study duplicate logging statements, which are logging statements that have the same static text message. Such duplications in the text message are potential indications of logging code smells, which may affect developers' understanding of the dynamic view of the system. We manually studied over 3K duplicate logging statements and their surrounding code in four large-scale open source systems and uncovered five patterns of duplicate logging code smells. For each instance of a problematic code smell, we contacted developers to verify our manual study results. We integrated our manual study results and the developers' feedback into our automated static analysis tool, DLFinder, which automatically detects problematic duplicate logging code smells. We evaluated DLFinder on the manually studied systems and two additional systems. In total, combining the results of DLFinder and our manual analysis, DLFinder is able to detect over 85% of the instances that were reported to developers and then fixed.
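
To make the definition of a duplicate logging statement concrete, the following is a minimal sketch that groups logging statements by their static text message. It is not DLFinder's implementation: it assumes Java-like source, a logger variable named `log` or `logger` (case-insensitive), and single-line logging calls, and it uses regular expressions rather than real static analysis. The names `LOG_CALL`, `static_text`, `find_duplicate_logs`, and `SAMPLE` are hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Assumption: logging calls look like LOG.error("text" + var); on one line.
LOG_CALL = re.compile(
    r'\b(?:log|logger)\.(?:trace|debug|info|warn|error|fatal)\s*\((.*)\)\s*;',
    re.IGNORECASE,
)
# Matches a double-quoted string literal, allowing escaped characters.
STRING_LITERAL = re.compile(r'"((?:[^"\\]|\\.)*)"')


def static_text(args: str) -> str:
    """Join the string literals of a logging call, dropping dynamic variables."""
    return "".join(STRING_LITERAL.findall(args))


def find_duplicate_logs(source: str) -> Dict[str, List[int]]:
    """Map each static message used more than once to its 1-based line numbers."""
    groups = defaultdict(list)
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = LOG_CALL.search(line)
        if match:
            text = static_text(match.group(1))
            if text:  # skip calls that have no static text at all
                groups[text].append(lineno)
    return {text: lines for text, lines in groups.items() if len(lines) > 1}


# Hypothetical input: two catch blocks that log the same static message.
SAMPLE = '''
try {
    open(path);
} catch (IOException e) {
    LOG.error("Failed to open file " + path);
} catch (SecurityException e) {
    LOG.error("Failed to open file " + path);
}
LOG.info("Done");
'''

if __name__ == "__main__":
    for text, lines in find_duplicate_logs(SAMPLE).items():
        print(repr(text), lines)
```

On the sample input, the two `LOG.error` calls share the static text `"Failed to open file "` and are reported together, while `LOG.info("Done")` appears only once and is ignored. Sharing a message is only a potential indication of a smell, so deciding whether a reported pair is actually problematic would still require inspecting the surrounding code.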
