RefDetect: A Multi-Language Refactoring Detection Tool Based on String Alignment

Refactoring is performed to improve software quality while leaving the behaviour of the software unchanged. Identifying the refactorings applied to a software system is an important activity that leads to a better understanding of how the system has evolved, and several techniques have been proposed and implemented for this purpose. The vast majority of existing refactoring detection techniques are language-specific, including the accepted state of the art, RMiner, which supports Java only. Although impressive performance has been achieved to date, there is scope for improvement in refactoring detection, and such improvement would benefit both refactoring research and practice. In this paper, we propose a novel, language-neutral technique to identify refactorings in commit histories. Our approach is motivated by a desire to explore the use of string alignment algorithms in refactoring detection, and to determine whether such approaches are competitive with the state of the art. The proposed approach has been implemented in a tool called RefDetect, evaluated, and compared with the current state-of-the-art refactoring detection tool, RMiner. In our experiments we applied RefDetect to 514 commits of 185 Java applications containing 5,058 true refactoring instances, achieving an overall f-score slightly better than that of RMiner (87.3% vs. 86%). RefDetect clearly outperformed RMiner on method-level and class-level refactorings, achieving f-scores of 87.7% vs. 81.7% for method-level refactorings and 92.1% vs. 86.9% for class-level refactorings. To demonstrate the language independence of RefDetect, we conducted a further study with four C++ applications, achieving high values for both precision (96.1%) and recall (94.1%). These results indicate that RefDetect performs better than the current state of the art in refactoring detection and is demonstrably capable of handling different programming languages.
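
The Java results above are reported as f-scores, while the C++ study is reported as precision and recall. As a rough aid for comparing the two, the following minimal sketch computes the f-score implied by the C++ figures. It assumes that "f-score" here means the standard balanced F1 measure (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name `f_score` is purely illustrative and is not part of RefDetect.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F1 measure: the harmonic mean of precision and recall.

    Assumption: the reported f-scores use this standard definition.
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when nothing is detected correctly
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported for the four C++ applications.
cpp_f = f_score(0.961, 0.941)
print(f"Implied C++ f-score: {cpp_f:.3f}")  # prints 0.951
```

Under that assumption, the C++ study corresponds to an f-score of roughly 95.1%. This figure is derived from the rounded percentages quoted above, so it may differ slightly from a value computed on the raw counts.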