Improving Performance by Genetically Optimizing Support Vector Machine to Detect Code Smells

Code smells commonly arise when the functionality of a complex system is subjected to frequent changes. They are design anomalies that impede future changes, lead to errors, and make software fragile. Appropriate refactoring is employed to treat them, but the first challenge in this process is detecting the code smells. Although the available detection tools perform well, their limitations discourage developers from adopting them. Machine learning is employed to overcome these limitations and to classify source code affected by smells. In this study, a support vector machine (SVM) with four different kernels is used to detect six types of code smells. To improve performance, the SVM is optimised by tuning its hyperparameters with a genetic algorithm. The genetic algorithm removes the need for an expert to choose the best hyperparameter values and explores a far wider range of values than is practical with grid search. Accuracies of the support vector classifier are compared before and after hyperparameter tuning. The results show that accuracy improves in every case, with a maximum improvement of 40%. The SVM with the radial basis function (RBF) kernel performed best, reaching an accuracy of up to 97.5%.
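
The tuning idea described above can be illustrated with a minimal sketch of a genetic search over the two usual SVM hyperparameters, C and gamma. It is an illustration under stated assumptions, not the study's implementation. The search bounds, population size, number of generations, and the operators (tournament selection, arithmetic crossover, Gaussian mutation, elitism) are all assumed here, and the function names are hypothetical. The fitness helper assumes scikit-learn is installed and uses its SVC and cross_val_score.

```python
import random

# Search ranges in log10 space. These bounds are illustrative assumptions,
# not values taken from the study.
BOUNDS = {"log_C": (-3.0, 3.0), "log_gamma": (-5.0, 1.0)}


def random_individual(rng):
    # One candidate solution: a value for each hyperparameter within its bounds.
    return {k: rng.uniform(lo, hi) for k, (lo, hi) in BOUNDS.items()}


def crossover(a, b, rng):
    # Arithmetic crossover: each gene is a random blend of the two parents.
    child = {}
    for k in BOUNDS:
        w = rng.random()
        child[k] = w * a[k] + (1 - w) * b[k]
    return child


def mutate(ind, rng, rate=0.3, sigma=0.5):
    # Gaussian mutation, clamped so that genes stay inside the search bounds.
    out = dict(ind)
    for k, (lo, hi) in BOUNDS.items():
        if rng.random() < rate:
            out[k] = min(hi, max(lo, out[k] + rng.gauss(0.0, sigma)))
    return out


def tournament(scored, rng, k=3):
    # Pick k scored individuals at random and return the fittest of them.
    return max(rng.sample(scored, k), key=lambda pair: pair[0])[1]


def genetic_search(fitness, pop_size=20, generations=15, seed=0):
    # Maximise `fitness` over BOUNDS; returns (best individual, best score).
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    best_score, best = float("-inf"), None
    for _ in range(generations):
        scored = [(fitness(ind), ind) for ind in population]
        gen_score, gen_best = max(scored, key=lambda pair: pair[0])
        if gen_score > best_score:
            best_score, best = gen_score, gen_best
        # Elitism: the best individual found so far survives unchanged.
        population = [dict(best)]
        while len(population) < pop_size:
            p1, p2 = tournament(scored, rng), tournament(scored, rng)
            population.append(mutate(crossover(p1, p2, rng), rng))
    return best, best_score


def svm_cv_accuracy(X, y, kernel="rbf", folds=5):
    # Builds a fitness function: mean cross-validated accuracy of an SVC.
    # Requires scikit-learn; imported lazily so the search itself is stdlib-only.
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    def fitness(ind):
        model = SVC(kernel=kernel, C=10 ** ind["log_C"], gamma=10 ** ind["log_gamma"])
        return cross_val_score(model, X, y, cv=folds, scoring="accuracy").mean()

    return fitness
```

Given a feature matrix `X` of code metrics and smell labels `y` (both hypothetical here), a call such as `genetic_search(svm_cv_accuracy(X, y))` would return the best log-scale values found and their cross-validated accuracy. Searching in log space is a common choice because useful values of C and gamma typically span several orders of magnitude, which is also why an exhaustive grid over the same range becomes expensive.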