Active Learning of Markov Decision Processes for System Verification

Formal model verification has proven to be a powerful tool for verifying and validating the properties of a system. Central to this class of techniques is the construction of an accurate formal model of the system being investigated. Unfortunately, manual construction of such models can be a resource-demanding process, and this shortcoming has motivated the development of algorithms for automatically learning system models from observed system behaviors. Recently, algorithms have been proposed for learning Markov decision process representations of reactive systems based on alternating sequences of input/output observations. While such algorithms alleviate the problem of manually constructing a system model, collecting or generating the observed system behaviors can itself prove demanding. Consequently, we seek to minimize the amount of data required. In this paper we propose an algorithm for learning deterministic Markov decision processes from data by actively guiding the selection of input actions. The algorithm is empirically analyzed by learning system models of slot machines, and it is demonstrated that the proposed active learning procedure can significantly reduce the amount of data required to obtain accurate system models.
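To make the idea concrete, the following is a minimal illustrative sketch (not the paper's actual algorithm) of actively guided input selection for a deterministic MDP: the learner always issues the input action it has tried least often in its current state, so every (state, action) pair is exercised with few samples. The toy system `TRUE_TRANSITIONS` and all names below are invented for illustration.

```python
from collections import defaultdict

# Hypothetical toy system: (state, input action) -> next state (deterministic).
TRUE_TRANSITIONS = {
    ("s0", "a"): "s1",
    ("s0", "b"): "s0",
    ("s1", "a"): "s2",
    ("s1", "b"): "s0",
    ("s2", "a"): "s2",
    ("s2", "b"): "s1",
}
ACTIONS = ["a", "b"]

def learn_actively(steps=50):
    """Actively pick the least-tried input in the current state, recording the
    observed successor. For a deterministic MDP one observation per
    (state, action) pair suffices to fix the transition."""
    learned = {}              # (state, action) -> observed next state
    tries = defaultdict(int)  # how often each (state, action) has been issued
    state = "s0"
    for _ in range(steps):
        # Active choice: prefer the input least explored in this state.
        action = min(ACTIONS, key=lambda a: tries[(state, a)])
        tries[(state, action)] += 1
        nxt = TRUE_TRANSITIONS[(state, action)]  # query the running system
        learned[(state, action)] = nxt
        state = nxt
    return learned
```

Under this least-tried policy the toy model is fully recovered after only six queries, whereas uniformly random input selection would typically revisit already-known pairs; the paper's algorithm pursues the same data-efficiency goal for stochastic output behavior rather than this purely deterministic toy.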
