An Experience Report on Applying Passive Learning in a Large-Scale Payment Company

Passive learning techniques infer graph models of the behavior of a system from large trace logs. The research community has dedicated great effort to making passive learning techniques more scalable and ready for use in industry. However, there is still a lack of empirical knowledge about the usefulness and applicability of such techniques in large-scale real systems. To that end, we conducted action research over nine months in a large payment company. Throughout this period, we iteratively applied passive learning techniques with the goal of revealing useful information to the development team. In each iteration, we discussed the findings and challenges with the expert developer of the company and improved our tools accordingly. In this paper, we present evidence that passive learning can indeed support development teams, a set of lessons learned during our experience, a proposed guide to facilitate its adoption, and current research challenges.
