Mining understandable state machine models from embedded code

Program understanding is a time-consuming and tedious activity for software developers. Manually building abstractions from source code requires in-depth analysis of the code. Automatic extraction of such models is possible, but it cannot derive meaningful abstractions that are not already contained in the code. Automated extraction even struggles to decide which aspects of the code are important and which are not. Therefore, interactive semi-automatic approaches are the compromise of choice. In this article, we describe how state machines that describe the behaviour of a function can be extracted from code. The approach is interactive: the user decides which of the potentially relevant pieces of information identified by the analysis are actually relevant and which are not. This helps to reduce the resulting state machines to an understandable size. However, in their raw form these state machines have transition conditions (guards) that are very complex and thus not understandable for humans. Therefore, we also introduce a technique to reduce these guards to an understandable form. The technique combines heuristic logic minimization, exploitation of infeasible paths, and the use of transition priorities. We evaluate the approach on industrial embedded C code, first in a case study with hundreds of extracted state machines, and then in two experiments with professional developers. The results show that the approach is highly effective in making the guards understandable, and that guards reduced by our approach and presented with priorities are easier to understand than guards without priorities. We also show that the overall approach is beneficial for program comprehension. The guard reduction approach itself is quite generic and can also be applied to other problems. We demonstrate this by applying it to the simplification of mode switch logic.
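To illustrate why transition priorities can shorten guards, the following Python sketch simplifies the guards of three prioritised transitions leaving one state. It is only an illustration under stated assumptions, not the implementation evaluated in the article: the variables `a` and `b`, the example guards, and all function names are hypothetical, and a brute-force search over single conjunctions of literals stands in for the heuristic logic minimization mentioned above. The idea shown is that a guard only has to be correct for inputs on which no higher-priority transition fires; all other inputs can be treated as don't-cares.

```python
from itertools import product

VARS = ("a", "b")  # hypothetical Boolean inputs of the extracted state machine


def assignments():
    """Yield every assignment of truth values to VARS."""
    for values in product((False, True), repeat=len(VARS)):
        yield dict(zip(VARS, values))


def cubes():
    """Yield conjunctions of literals, fewest literals first.

    A cube maps a variable to its required polarity; the empty cube is 'true'.
    """
    options = sorted(
        product((None, True, False), repeat=len(VARS)),
        key=lambda c: sum(p is not None for p in c),
    )
    for c in options:
        yield {v: p for v, p in zip(VARS, c) if p is not None}


def eval_cube(cube, env):
    return all(env[v] == pol for v, pol in cube.items())


def show(cube):
    if not cube:
        return "true"
    return " && ".join(v if pol else "!" + v for v, pol in cube.items())


def simplify_with_priorities(guards):
    """guards: predicates ordered from highest to lowest priority.

    Each guard is replaced by the shortest single cube that agrees with it on
    its care set, i.e. on the inputs where no higher-priority guard holds.
    Brute force, so only suitable for a handful of variables.
    """
    result = []
    for i, guard in enumerate(guards):
        care = [env for env in assignments()
                if not any(g(env) for g in guards[:i])]
        for cube in cubes():
            if all(eval_cube(cube, env) == guard(env) for env in care):
                result.append(show(cube))
                break
        else:
            result.append(None)  # no single cube fits; keep the original guard
    return result


# Hypothetical raw guards of three transitions, highest priority first.
guards = [
    lambda e: e["a"],
    lambda e: not e["a"] and e["b"],
    lambda e: not e["a"] and not e["b"],
]

print(simplify_with_priorities(guards))  # ['a', 'b', 'true']
```

Read with priorities, the three transitions become "if `a`", "else if `b`", "else", which is the kind of presentation the experiments compare against guards without priorities.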
