Software similarity measurements using UML diagrams: A systematic literature review

Every piece of software uses a model to derive its operational, auxiliary, and functional procedures. The Unified Modeling Language (UML) is a standard visual modeling language for specifying, documenting, and constructing a software product. Researchers have used several algorithms to measure similarities between UML artifacts. However, no literature study has yet surveyed measurements of UML diagram similarity. This paper presents the results of a systematic literature review of similarity measurements between the UML diagrams of different software products. The study identifies and reviews similarity measurements of UML artifacts and finds that class, sequence, statechart, and use case diagrams are the UML diagrams most widely used as research objects for measuring similarity. Measuring similarity helps address problems in the domains of software reuse, similarity measurement, and clone detection. The instruments used to measure similarity are semantic similarity and structural similarity. The findings indicate opportunities for future research: calculating the similarity of other UML diagram types, compiling calculation information for each diagram, adapting semantic and structural similarity calculation methods, determining the best weight for each item in a diagram, testing newly proposed methods, and building or finding good datasets to use as test material.
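As an illustration of how semantic and structural similarity might be combined with a weight, the following is a minimal sketch and not a method taken from any of the reviewed studies. It assumes a heavily simplified class diagram (a set of class names plus relationship triples), uses lower-cased name overlap as a crude stand-in for semantic similarity, and compares counts of relationship kinds as a stand-in for structural similarity. The names `ClassDiagram`, `semantic_similarity`, `structural_similarity`, `combined_similarity`, and the default weight of 0.5 are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassDiagram:
    """Hypothetical, simplified stand-in for a UML class diagram."""
    class_names: frozenset  # names of the classes in the diagram
    relations: frozenset    # (source, kind, target) triples


def semantic_similarity(d1, d2):
    """Jaccard overlap of lower-cased class names (a crude lexical proxy)."""
    names1 = {name.lower() for name in d1.class_names}
    names2 = {name.lower() for name in d2.class_names}
    if not names1 and not names2:
        return 1.0  # two empty diagrams are treated as identical
    return len(names1 & names2) / len(names1 | names2)


def structural_similarity(d1, d2):
    """Multiset Jaccard over relationship kinds, ignoring class names."""
    kinds1 = Counter(kind for _, kind, _ in d1.relations)
    kinds2 = Counter(kind for _, kind, _ in d2.relations)
    union = sum((kinds1 | kinds2).values())  # element-wise maximum
    if union == 0:
        return 1.0  # neither diagram has relationships
    return sum((kinds1 & kinds2).values()) / union  # element-wise minimum


def combined_similarity(d1, d2, semantic_weight=0.5):
    """Weighted sum of the two scores; the weight is an assumed parameter."""
    if not 0.0 <= semantic_weight <= 1.0:
        raise ValueError("semantic_weight must lie in [0, 1]")
    return (semantic_weight * semantic_similarity(d1, d2)
            + (1.0 - semantic_weight) * structural_similarity(d1, d2))


# Two made-up diagrams used only to exercise the functions.
library = ClassDiagram(
    class_names=frozenset({"Book", "Member", "Loan"}),
    relations=frozenset({
        ("Loan", "association", "Book"),
        ("Loan", "association", "Member"),
    }),
)
shop = ClassDiagram(
    class_names=frozenset({"Product", "Member", "Order"}),
    relations=frozenset({
        ("Order", "association", "Product"),
        ("Order", "association", "Member"),
        ("Order", "composition", "Product"),
    }),
)

if __name__ == "__main__":
    print(semantic_similarity(library, shop))
    print(structural_similarity(library, shop))
    print(combined_similarity(library, shop))
```

The choice of `semantic_weight` is exactly the kind of parameter the abstract lists as an open question; a real study would tune it against a labeled dataset and would likely replace the name overlap with a lexical-database or embedding-based measure.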
