System-Call Based Problem Diagnosis for PVFS

Abstract: We present a syscall-based approach to automatically diagnose performance problems, server-to-client propagated errors, and server crash/hang problems in PVFS. Our approach compares the statistical and semantic attributes of syscalls across PVFS servers to identify the culprit server under each of these problems, for different file-system benchmarks (dd, PostMark, and IOzone) in a PVFS cluster.
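
To make the idea of comparing a statistical syscall attribute across servers concrete, the following is a minimal sketch and not the authors' algorithm. It assumes that per-server syscall latencies have already been collected, and it picks a median-based outlier score and a threshold of 3.0 purely for illustration; the function name `find_culprit`, the input layout, and the threshold are all hypothetical.

```python
from statistics import median


def find_culprit(latencies_by_server, threshold=3.0):
    """Return the server whose mean syscall latency is an outlier among its
    peers, or None if no server stands out.

    latencies_by_server: dict mapping a server name to a list of observed
    syscall latencies (hypothetical input format, assumed for illustration).
    threshold: how many median absolute deviations count as anomalous
    (an assumed value, not taken from the paper).
    """
    # One summary statistic per server: the mean latency.
    means = {s: sum(v) / len(v) for s, v in latencies_by_server.items() if v}
    if len(means) < 3:
        return None  # too few peers for a meaningful comparison

    # Robust centre and spread of the per-server means.
    med = median(means.values())
    mad = median(abs(m - med) for m in means.values())
    scale = max(mad, 1e-9)  # avoid division by zero when peers are identical

    # Score each server by its distance from the peer median.
    scores = {s: abs(m - med) / scale for s, m in means.items()}
    culprit = max(scores, key=scores.get)
    return culprit if scores[culprit] > threshold else None


if __name__ == "__main__":
    observed = {
        "server1": [1.0, 1.1, 0.9],
        "server2": [1.0, 1.2, 1.0],
        "server3": [1.1, 0.9, 1.0],
        "server4": [5.0, 6.0, 5.5],  # noticeably slower than its peers
    }
    print(find_culprit(observed))
```

The sketch relies on the premise that servers in the same cluster see similar workloads, so a server that deviates strongly from its peers is a plausible culprit; the semantic attributes mentioned in the abstract are not modeled here.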
