Trace-based Behaviour Analysis of Network Servers

Software and networks can each be analysed with established tools, such as debuggers and packet analysers, but applying these tools to network software is impractical because of the sheer detail they present and the performance overheads they typically impose. This makes it difficult to diagnose performance anomalies in network software precisely, identify their causes (is it a DoS attack or a bug?), and determine what needs to be fixed. We present Flowdar, a practical tool that analyses software traces to produce intuitive summaries of network software behaviour by abstracting away unimportant details and demultiplexing traces into per-session subtraces. Flowdar can use existing state-of-the-art tracing tools to keep overhead low while gathering traces for offline analysis. Using Flowdar, we can drill down when diagnosing performance anomalies without being overwhelmed by detail or burdening the system being observed. We show that Flowdar can be applied to existing real-world software and can digest complex behaviour into an intuitive visualisation.
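The two ideas named in the abstract, abstracting unimportant details and demultiplexing a trace into per-session subtraces, can be illustrated with a minimal sketch. This is not Flowdar's implementation or its trace format: the `(timestamp, session_id, event)` tuple layout, the `NOISE` set, and the function names `demultiplex` and `summarise` are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical set of low-level events treated as unimportant detail.
NOISE = {"malloc", "free", "memcpy"}


def demultiplex(trace, noise=NOISE):
    """Split an interleaved trace into per-session subtraces.

    `trace` is assumed to be an iterable of (timestamp, session_id, event)
    tuples; events listed in `noise` are dropped.
    """
    subtraces = defaultdict(list)
    for timestamp, session_id, event in trace:
        if event in noise:
            continue  # abstract away unimportant detail
        subtraces[session_id].append((timestamp, event))
    return dict(subtraces)


def summarise(subtrace):
    """Collapse consecutive repeats of an event into (event, count) pairs."""
    summary = []
    for _, event in subtrace:
        if summary and summary[-1][0] == event:
            summary[-1] = (event, summary[-1][1] + 1)
        else:
            summary.append((event, 1))
    return summary


# Two interleaved sessions from an imagined server trace.
trace = [
    (0.0, 1, "accept"),
    (0.1, 2, "accept"),
    (0.2, 1, "malloc"),
    (0.3, 1, "read"),
    (0.4, 2, "read"),
    (0.5, 1, "read"),
    (0.6, 1, "write"),
    (0.7, 2, "close"),
]

sessions = demultiplex(trace)
for session_id, subtrace in sorted(sessions.items()):
    print(session_id, summarise(subtrace))
```

Because the grouping runs over a trace that has already been recorded, it fits the offline-analysis setting the abstract describes and adds no work to the system being observed.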
