SMURF: A SVM-based Incremental Anti-pattern Detection Approach

In typical software development projects today, hundreds of developers work asynchronously in space and time and may introduce anti-patterns into their software systems because of time pressure or a lack of understanding, communication, and/or skills. Anti-patterns impede development and maintenance activities by making the source code more difficult to understand. Detecting anti-patterns incrementally and on subsets of a system could reduce costs, effort, and resources by allowing practitioners to identify and take into account occurrences of anti-patterns as they find them during their development and maintenance activities. Researchers have proposed approaches to detect occurrences of anti-patterns, but these approaches currently have four limitations: (1) they require extensive knowledge of anti-patterns, (2) they have limited precision and recall, (3) they are not incremental, and (4) they cannot be applied to subsets of systems. To overcome these limitations, we introduce SMURF, a novel approach to detect anti-patterns that is based on a machine learning technique, support vector machines (SVM), and takes into account practitioners' feedback. Through an empirical study involving three systems and four anti-patterns, we showed that the accuracy of SMURF is greater than that of DETEX and BDTEX when detecting occurrences of anti-patterns. We also showed that SMURF can be applied in both intra-system and inter-system configurations. Finally, we reported that the accuracy of SMURF improves when using practitioners' feedback.
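The general idea of an SVM-based detector that is refined by practitioners' feedback can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from SMURF itself: the two standardized class metrics, the toy data, the function names (`train_linear_svm`, `predict`, `retrain_with_feedback`), and the choice of a linear SVM trained by sub-gradient descent on the hinge loss. Feedback is handled here simply by retraining on the enlarged labelled set, which may differ from how SMURF incorporates it.

```python
# Hypothetical sketch, not SMURF's implementation: a tiny linear SVM over
# two made-up, standardized class metrics, retrained after feedback.

def train_linear_svm(samples, labels, epochs=200, lr=0.1, reg=0.01):
    """Sub-gradient descent on L2-regularized hinge loss; labels are +1/-1."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Regularization step: shrink the weights slightly.
            w = [wi * (1.0 - lr * reg) for wi in w]
            # Hinge-loss step: move only if the sample violates the margin.
            if margin < 1.0:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(model, x):
    """Return +1 (anti-pattern candidate) or -1 (not a candidate)."""
    w, b = model
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


def retrain_with_feedback(samples, labels, feedback):
    """Append practitioner-labelled (metrics, label) pairs and retrain."""
    new_samples = list(samples) + [x for x, _ in feedback]
    new_labels = list(labels) + [y for _, y in feedback]
    return train_linear_svm(new_samples, new_labels)


# Assumed toy data: (size metric, coupling metric) as z-scores per class.
SAMPLES = [(1.5, 1.2), (1.2, 1.6), (1.8, 1.4),        # labelled anti-pattern
           (-1.0, -0.8), (-0.7, -1.1), (-1.2, -0.9)]  # labelled clean
LABELS = [1, 1, 1, -1, -1, -1]

if __name__ == "__main__":
    model = train_linear_svm(SAMPLES, LABELS)
    print(predict(model, (2.0, 2.0)), predict(model, (-2.0, -2.0)))
    # Suppose a practitioner reviews this class and judges it clean.
    updated = retrain_with_feedback(SAMPLES, LABELS, [((0.3, 0.4), -1)])
    print(predict(updated, (0.3, 0.4)))
```

Because the classifier is trained only on labelled classes described by metrics, the same sketch applies whether the labelled classes come from a subset of one system or from a different system, which mirrors the intra-system and inter-system configurations mentioned above.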
