L*-Based Learning of Markov Decision Processes (Extended Version)

Automata learning techniques automatically generate system models from test observations. These techniques usually fall into two categories: passive and active. Passive learning uses a predetermined data set, e.g., system logs. In contrast, active learning queries the system under learning directly, which is generally considered more efficient. An influential active learning technique is Angluin's L* algorithm for regular languages, which has inspired several generalisations from deterministic finite automata (DFAs) to other automata-based modelling formalisms. In this work, we study L*-based learning of deterministic Markov decision processes, first assuming an ideal setting with perfect information. We then relax this assumption and present a novel learning algorithm that collects information by sampling system traces via testing. Experiments with an implementation of our sampling-based algorithm suggest that it achieves better accuracy than state-of-the-art passive learning techniques given the same amount of test data. Unlike existing learning algorithms that assume predefined states, our algorithm learns the complete model structure, including the states.
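
To make the notion of sampling system traces via testing concrete, the following is a minimal sketch and not the algorithm of this work. It assumes a hypothetical toy coffee-machine MDP (the names `LABEL`, `TRANS`, `sample_trace` and `estimate` are illustrative only) in which each state carries an output label and the successors of any state-input pair have distinct labels. Each test resets the system, applies a sequence of inputs and records the outputs; repeating the test yields an empirical estimate of the output distribution that follows a given input sequence.

```python
import random
from collections import Counter

# Hypothetical toy MDP (an assumption for illustration, not from the paper).
# Each state has an output label; (state, input) maps to (probability, successor)
# pairs. Successors of one (state, input) pair carry distinct labels, so an
# observed trace identifies the state that was reached.
LABEL = {"s0": "init", "s1": "beep", "s2": "coffee"}
TRANS = {
    ("s0", "but"): [(1.0, "s0")],
    ("s0", "coin"): [(1.0, "s1")],
    ("s1", "but"): [(0.9, "s2"), (0.1, "s0")],
    ("s1", "coin"): [(1.0, "s1")],
    ("s2", "but"): [(1.0, "s0")],
    ("s2", "coin"): [(1.0, "s1")],
}


def sample_trace(inputs, rng):
    """Run one test: reset to s0, apply the inputs, record observed outputs."""
    state = "s0"
    trace = [LABEL[state]]
    for i in inputs:
        probs, succs = zip(*TRANS[(state, i)])
        state = rng.choices(succs, weights=probs)[0]
        trace += [i, LABEL[state]]
    return tuple(trace)


def estimate(inputs, n, rng):
    """Estimate the distribution of the last output over n sampled traces."""
    counts = Counter(sample_trace(inputs, rng)[-1] for _ in range(n))
    return {output: count / n for output, count in counts.items()}


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs sample the same traces
    print(estimate(["coin", "but"], 2000, rng))
```

With enough samples, the estimated frequency of `coffee` after the inputs `coin`, `but` approaches the 0.9 encoded in the toy model. A sampling-based learner would additionally have to decide when two such estimated distributions are statistically compatible, which this sketch leaves out.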
