Efficiently identifying deterministic real-time automata from labeled data

We develop a novel learning algorithm, RTI, for identifying a deterministic real-time automaton (DRTA) from labeled time-stamped event sequences. The RTI algorithm is based on the current state of the art in deterministic finite-state automaton (DFA) identification, called evidence-driven state-merging (EDSM). In addition to having a DFA structure, a DRTA contains time constraints between occurrences of consecutive events. Although this seems a small difference, we show that the problem of identifying a DRTA is much more difficult than the problem of identifying a DFA: identifying only the time constraints of a DRTA given its DFA structure is already NP-complete. In spite of this additional complexity, we show that RTI is a correct and complete algorithm that converges efficiently (using polynomial time and data) to the correct DRTA in the limit. To the best of our knowledge, this is the first algorithm that can identify a timed automaton model from time-stamped event sequences.

A straightforward alternative to identifying DRTAs is to identify a DFA that models time implicitly, i.e., a DFA that uses different states for different points in time. Such a DFA can be identified by first sampling the timed sequences using a fixed frequency, and subsequently applying EDSM to the resulting non-timed event sequences. We evaluate the performance of both RTI and this sampling approach experimentally on artificially generated data. In these experiments RTI outperforms the sampling approach significantly. Thus, we show that if we obtain data from a real-time system, it is easier to identify a DRTA from this data than to identify an equivalent DFA.
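To make the two representations concrete, the following is a minimal Python sketch, not the RTI algorithm itself. It assumes integer delays, an illustrative DRTA encoding in which each transition carries an inclusive delay guard, and a tick-symbol sampling scheme. The names `drta_accepts`, `sample`, `delta`, and the tick symbol `"t"` are hypothetical, and the exact sampling encoding used in the experiments is not specified here.

```python
from typing import Dict, List, Set, Tuple

# A timed word: (event, delay since the previous event), integer delays assumed.
TimedWord = List[Tuple[str, int]]

# Hypothetical DRTA encoding: (state, event) -> [(lo, hi, next_state), ...],
# where [lo, hi] is an inclusive delay guard. Determinism requires that the
# guards listed for one (state, event) pair do not overlap.
Transitions = Dict[Tuple[int, str], List[Tuple[int, int, int]]]


def drta_accepts(delta: Transitions, accepting: Set[int],
                 word: TimedWord, start: int = 0) -> bool:
    """Run the timed word through the automaton and report acceptance."""
    state = start
    for event, delay in word:
        for lo, hi, target in delta.get((state, event), []):
            if lo <= delay <= hi:
                state = target
                break
        else:
            return False  # no transition whose guard admits this delay
    return state in accepting


def sample(word: TimedWord, period: int, tick: str = "t") -> List[str]:
    """Assumed sampling scheme: emit one tick per full period elapsed
    before each event, yielding an untimed sequence for DFA learning."""
    out: List[str] = []
    for event, delay in word:
        out.extend([tick] * (delay // period))
        out.append(event)
    return out


# Illustrative automaton: "a" loops in state 0 when the delay is short (0..5)
# and moves to accepting state 1 when the delay is long (6..10).
delta: Transitions = {
    (0, "a"): [(0, 5, 0), (6, 10, 1)],
    (1, "b"): [(0, 10, 1)],
}
accepting = {1}
```

With this encoding, a single guarded transition covers a whole range of delays, whereas the sampled representation spends one symbol (and hence potentially one DFA state) per elapsed period, which illustrates why the implicit-time DFA can become much larger than the DRTA.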
