Tracking in a spaghetti bowl: monitoring transactions using footprints

This paper considers the problem of tracking end-to-end service-level transactions in the absence of instrumentation support. Transaction instances progress through a state-transition model and generate a time-stamped footprint on entering each state. The goal is to track individual transactions using these footprints, even when the footprints contain no tokens that uniquely identify the transaction instances that generated them. Assuming a semi-Markov process model for state transitions, the transaction instances are tracked probabilistically by matching them to the available footprints according to the maximum-likelihood (ML) criterion. For a two-state system under the ML rule, the probability that all instances are matched correctly is shown to be minimized when the transition times are i.i.d. exponential. When the transition times are i.i.d., the ML rule reduces to a minimum-weight bipartite matching, and reduces further to a first-in first-out match for a special class of distributions. For a multi-state model with an acyclic state-transition digraph, a constructive proof shows that the ML rule reduces to splicing the results of independent matchings of many bipartite systems.
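To make the two-state case concrete, the following is a minimal sketch of ML matching, not the paper's implementation. It assumes entry and exit footprints are plain timestamp lists, that transition times are i.i.d. with an Erlang-2 density (chosen only for illustration), and it scores every assignment by brute force in place of a proper minimum-weight bipartite matching solver. The names `log_density_erlang2` and `ml_match` are hypothetical.

```python
import itertools
import math


def log_density_erlang2(x, rate=1.0):
    """Log-density of an Erlang-2 transition time (illustrative assumption)."""
    if x <= 0:
        # An exit footprint cannot precede its entry footprint.
        return -math.inf
    return 2 * math.log(rate) + math.log(x) - rate * x


def ml_match(entries, exits, log_density=log_density_erlang2):
    """Return the assignment of entries to exits with the highest log-likelihood.

    perm[i] is the index of the exit footprint matched to entry i.
    Brute force over all permutations; only practical for small inputs.
    """
    n = len(entries)
    best_perm, best_ll = None, -math.inf
    for perm in itertools.permutations(range(n)):
        # With i.i.d. transition times, the log-likelihood is a sum of
        # per-pair terms, i.e. the (negated) weight of a bipartite matching.
        ll = sum(log_density(exits[perm[i]] - entries[i]) for i in range(n))
        if ll > best_ll:
            best_perm, best_ll = perm, ll
    return best_perm, best_ll


# Hypothetical footprints: three transactions enter, three exit.
entries = [0.0, 0.5, 1.2]
exits = [1.4, 2.0, 3.1]
perm, ll = ml_match(entries, exits)
print(perm, ll)
```

For these timestamps and this density, the best assignment pairs footprints in arrival order, which is the first-in first-out match mentioned above. Replacing the Erlang-2 density with an exponential one gives every feasible assignment the same likelihood, since the sum of durations does not depend on the assignment; this is consistent with the exponential case being the hardest one to match.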
