An exploration of targeted provenance for reliability

Recent advances show that system-wide provenance gathering can be performed with reasonable overhead. In addition, generative dependencies have been used to recover files for non-reliability purposes. We pose the following research question: can provenance be used to regenerate files lost to storage failures, and thereby improve reliability? We have designed, prototyped, and assessed the feasibility of the Legend file system, which exploits targeted provenance to regenerate lost files. Under a number of workloads, we observed that Legend combined with 2-way replication can match or outperform 3-way replication. Its regeneration rate also exceeds the throttled recovery bandwidth of common RAIDs.
