Efficient and robust data integrity verification scheme for high-performance storage devices

Most of the data generated on high-performance computing systems is transferred to storage on remote systems for purposes such as backup. To detect data corruption caused by network or storage failures during the transfer, the receiving system verifies data integrity by comparing checksums of the data. However, existing end-to-end integrity verification techniques do not sufficiently take the internal operation of the storage device into account. In this paper, we propose an efficient and robust data integrity verification scheme for large-scale data transfer between computing systems with high-performance storage devices. To make the integrity verification robust, we control the order of I/O operations. To make it efficient, we parallelize checksum computation and overlap it with I/O operations.
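The two ideas in the abstract, ordering I/O so that verification follows the write to storage, and overlapping parallel checksum computation with reads, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the chunk size, the worker count, the use of SHA-256, and the use of `os.fsync` as the ordering mechanism are all assumptions made for illustration. Note also that `fsync` only asks the operating system to flush the file to the device; a later read may still be served from the page cache, which is one reason device-level behavior matters for this problem.

```python
import hashlib
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 4 * 1024 * 1024  # hypothetical chunk size, not taken from the paper


def _digest(block: bytes) -> str:
    # hashlib releases the GIL while hashing large buffers,
    # so worker threads can hash several chunks concurrently.
    return hashlib.sha256(block).hexdigest()


def chunk_checksums(path, chunk_size=CHUNK_SIZE, workers=4):
    """Return per-chunk SHA-256 digests of a file, in file order.

    The main thread keeps reading the next chunk while worker threads
    hash the chunks already read, so checksum work overlaps with I/O.
    For simplicity, pending chunks are not bounded in memory.
    """
    futures = []
    with ThreadPoolExecutor(max_workers=workers) as pool, open(path, "rb") as f:
        while True:
            block = f.read(chunk_size)
            if not block:
                break
            futures.append(pool.submit(_digest, block))
        return [fut.result() for fut in futures]


def write_durably(path, data):
    """Write data and request a flush to the device before returning.

    Assumption: flushing before the verification read is used here as a
    simple stand-in for controlling the order of I/O operations.
    """
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())


def verify_transfer(src, dst, chunk_size=CHUNK_SIZE, workers=4):
    """Compare per-chunk checksums of the source and the received copy."""
    return (chunk_checksums(src, chunk_size, workers)
            == chunk_checksums(dst, chunk_size, workers))
```

In a real transfer the sender and receiver would each compute `chunk_checksums` locally and exchange only the digests. Per-chunk digests also make it possible to retransmit just the chunks that differ rather than the whole file.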
