A Metric-Based Approach to Detect Abstract Data Types and State Encapsulations

This article presents an approach to identifying abstract data types (ADTs) and abstract state encapsulations (ASEs, also called abstract objects) in source code. The approach, named similarity clustering, groups functions, types, and variables into ADT and ASE candidates according to the proportion of features they share. The features considered include the context of these elements, their relationships to their environment, and informal information. A prototype tool has been implemented to support the approach and has been applied to three C systems of between 30 and 38 Kloc each. The ADTs and ASEs identified by the approach are compared to those identified by software engineers who knew neither the proposed approach nor other automatic approaches. Within this case study, the approach has been shown to have a higher detection quality and to identify, in most cases, more ADTs and ASEs than the other techniques. In all other cases its detection quality is second best. N.B. This article reports on work in progress on this approach, which has evolved since it was presented in the original ASE97 conference paper.
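
To illustrate the idea of grouping elements by the proportion of features they share, here is a minimal Python sketch. It is not the article's metric or tool: the abstract does not give the similarity formula, so the Jaccard index is assumed as a stand-in, the single-link merging rule and the 0.5 threshold are arbitrary choices, and the entity names and feature labels (`uses:Stack`, `file:stack.c`, `word:push`) are hypothetical examples of context, relationship, and informal (naming) features.

```python
from itertools import combinations


def similarity(a, b):
    """Proportion of shared features (Jaccard index); an assumed stand-in metric."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(features, threshold=0.5):
    """Group entities whose pairwise similarity reaches the threshold (single link)."""
    parent = {name: name for name in features}

    def find(x):
        # Follow parent links to the representative of x's group.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(features, 2):
        if similarity(features[a], features[b]) >= threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for name in features:
        groups.setdefault(find(name), set()).add(name)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical entities from a C system: functions, a type, and a global variable.
features = {
    "Stack":      {"uses:Stack", "file:stack.c", "word:stack"},
    "stack_push": {"uses:Stack", "file:stack.c", "word:stack", "word:push"},
    "stack_pop":  {"uses:Stack", "file:stack.c", "word:stack", "word:pop"},
    "log_file":   {"uses:log_file", "file:log.c", "word:log", "word:file"},
    "log_msg":    {"uses:log_file", "file:log.c", "word:log", "word:msg"},
}

candidates = cluster(features)
print(candidates)
```

In this toy input, the group built around the type `Stack` plays the role of an ADT candidate, and the group built around the global variable `log_file` plays the role of an ASE candidate. A real analysis would extract the features from source code and would likely weight them differently.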
