Lustre, one of the most popular parallel file systems in high-performance computing (HPC), provides a POSIX interface and maintains a large set of POSIX-related metadata, which can be corrupted by hardware failures, software bugs, configuration errors, and other faults. The Lustre file system checker (LFSCK) is the tool for detecting metadata inconsistencies and restoring a corrupted Lustre to a valid state, and is therefore critical for reliable HPC. Unfortunately, in practice LFSCK runs slowly in large deployments, which makes system administrators reluctant to use it as a routine maintenance tool. Consequently, cascading errors may lead to unrecoverable failures, resulting in significant downtime or even data loss. As HPC rapidly moves toward exascale and much larger Lustre file systems are deployed, it is critical to understand the performance of LFSCK. In this paper, we study the performance of LFSCK to identify its bottlenecks and analyze its potential for improvement. Specifically, we design an aging method based on real-world HPC workloads to age Lustre to representative states, and then systematically evaluate and analyze how LFSCK runs on such an aged Lustre by monitoring the utilization of various resources. Our experiments show that the design and implementation of LFSCK are sub-optimal: it suffers from a scalability bottleneck on the metadata server (MDS), a relatively high fan-out ratio in network utilization, and unnecessary blocking among internal components. Based on these observations, we discuss potential optimizations and present some preliminary results.