flexfringe: A Passive Automaton Learning Package

Finite state models, such as Mealy machines or state charts, are often used to express and specify protocol and software behavior. Consequently, these models are widely used in verification and testing, and to assist in development and maintenance. Reverse engineering these models from execution traces and log files can, in turn, accelerate and improve software development and inform domain experts about the processes actually executed in a system. We present flexfringe, an open-source software tool that learns variants of finite state automata from traces, with a state-of-the-art evidence-driven state-merging algorithm at its core. We embrace the need for customized models and tailored learning heuristics in different application domains by providing a flexible, extensible interface.
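To make the idea of evidence-driven state merging concrete, the following is a minimal sketch in Python. It is not flexfringe's implementation or interface: the names `Node`, `build_prefix_tree`, and `merge_score` are hypothetical, and the traces are made up for illustration. The sketch builds a prefix tree from labeled traces and scores a candidate merge of two states by counting how many labeled states would agree, rejecting the merge when an accepting state would be combined with a rejecting one. A full state-merging learner would also fold the merged states together, propagate labels, and repeat until no merge remains.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional, Sequence, Tuple


@dataclass
class Node:
    # True = accepting, False = rejecting, None = no trace ends here
    label: Optional[bool] = None
    children: Dict[str, "Node"] = field(default_factory=dict)


def build_prefix_tree(traces: Iterable[Tuple[Sequence[str], bool]]) -> Node:
    """Build a prefix tree in which each trace ends in a labeled node."""
    root = Node()
    for symbols, accepted in traces:
        node = root
        for symbol in symbols:
            node = node.children.setdefault(symbol, Node())
        node.label = accepted
    return root


def merge_score(a: Node, b: Node) -> Optional[int]:
    """Score a candidate merge of a and b (simplified evidence count).

    Returns the number of labeled state pairs that agree when b's subtree
    is overlaid on a's, or None if any pair carries conflicting labels.
    """
    score = 0
    if a.label is not None and b.label is not None:
        if a.label != b.label:
            return None  # accepting vs. rejecting: merge is inconsistent
        score += 1
    for symbol, child_b in b.children.items():
        child_a = a.children.get(symbol)
        if child_a is None:
            continue  # no shared transition, so no evidence either way
        sub_score = merge_score(child_a, child_b)
        if sub_score is None:
            return None
        score += sub_score
    return score


# Hypothetical traces over the alphabet {a, b}; True marks an accepted trace.
traces = [
    (["a", "b"], True),
    (["a", "b", "a", "b"], True),
    (["a"], False),
    (["a", "b", "a"], False),
]
root = build_prefix_tree(traces)
state_after_ab = root.children["a"].children["b"]

consistent = merge_score(root, state_after_ab)
conflicting = merge_score(root.children["a"], state_after_ab)
print(consistent, conflicting)
```

In this toy example, merging the root with the state reached after `a b` is consistent and supported by shared evidence, while merging the state reached after `a` with it is rejected because one is rejecting and the other accepting. An evidence-driven learner would prefer the consistent merge with the highest score.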
