Inferring models of concurrent systems from logs of their behavior with CSight

Concurrent systems are notoriously difficult to debug and understand. A common way of gaining insight into system behavior is to inspect execution logs and documentation. Unfortunately, manual inspection of logs is an arduous process, and documentation is often incomplete and out of sync with the implementation. To provide developers with more insight into concurrent systems, we developed CSight. CSight mines logs of a system's executions to infer a concise and accurate model of that system's behavior, in the form of a communicating finite state machine (CFSM). Engineers can use the inferred CFSM model to understand complex behavior, detect anomalies, debug, and increase confidence in the correctness of their implementations. CSight's only requirement is that the logged events have vector timestamps. We provide a tool that automatically adds vector timestamps to system logs. Our tool prototypes are available at http://synoptic.googlecode.com/. This paper presents algorithms for inferring CFSM models from traces of concurrent systems, proves them correct, provides an implementation, and evaluates the implementation in two ways: by running it on logs from three different networked systems and via a user study that focused on bug finding. Our evaluation finds that CSight infers accurate models that can help developers find bugs.
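The vector-timestamp requirement can be illustrated with a minimal sketch of the standard vector-clock rules. This is not CSight's implementation or its log format: the function names (`tick`, `merge`, `happened_before`, `concurrent`) and the process names are assumptions chosen for illustration. The sketch shows how vector timestamps let a tool recover the partial order of logged events across processes.

```python
from typing import Dict

# A vector timestamp maps each process name to a counter.
# Missing entries are treated as 0. (Hypothetical representation.)
VectorClock = Dict[str, int]


def tick(clock: VectorClock, process: str) -> VectorClock:
    """Return a copy of `clock` with the process's own counter advanced by one."""
    updated = dict(clock)
    updated[process] = updated.get(process, 0) + 1
    return updated


def merge(local: VectorClock, received: VectorClock, process: str) -> VectorClock:
    """On message receipt: component-wise maximum, then tick the receiver."""
    keys = set(local) | set(received)
    merged = {k: max(local.get(k, 0), received.get(k, 0)) for k in keys}
    return tick(merged, process)


def happened_before(a: VectorClock, b: VectorClock) -> bool:
    """True if a <= b in every component and a < b in at least one."""
    keys = set(a) | set(b)
    no_greater = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    some_less = any(a.get(k, 0) < b.get(k, 0) for k in keys)
    return no_greater and some_less


def concurrent(a: VectorClock, b: VectorClock) -> bool:
    """True if each timestamp is ahead of the other in some component."""
    keys = set(a) | set(b)
    a_ahead = any(a.get(k, 0) > b.get(k, 0) for k in keys)
    b_ahead = any(b.get(k, 0) > a.get(k, 0) for k in keys)
    return a_ahead and b_ahead


# Hypothetical two-process exchange: the client sends, the server receives,
# and the client performs another local event in the meantime.
send_ts = tick({}, "client")
recv_ts = merge({}, send_ts, "server")
local_ts = tick(send_ts, "client")
```

In this exchange the send is ordered before the receive, while the server's receive and the client's later local event are unordered, which is the kind of concurrency a totally ordered log would hide.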
