Lessons Learned from Optimizing the Sunway Storage System for Higher Application I/O Performance
Bin Yang | Kang Chen | Wei Xue | Xu Ji | Qi Chen | Zuo-Ning Chen
[1] Robert B. Ross,et al. Optimization Techniques at the I/O Forwarding Layer , 2010, 2010 IEEE International Conference on Cluster Computing.
[2] Christian Scheideler,et al. Towards a Scalable and Robust DHT , 2006, SPAA '06.
[3] Robert Latham,et al. 24/7 Characterization of petascale I/O workloads , 2009, 2009 IEEE International Conference on Cluster Computing and Workshops.
[4] Aditya Akella,et al. Altruistic Scheduling in Multi-Resource Clusters , 2016, OSDI.
[5] Yang Liu,et al. Automatic identification of application I/O signatures from noisy server-side traces , 2014, FAST.
[6] Robert Latham,et al. Scalable I/O forwarding framework for high-performance computing systems , 2009, 2009 IEEE International Conference on Cluster Computing and Workshops.
[7] Raghul Gunasekaran,et al. Understanding I/O workload characteristics of a Peta-scale storage system , 2015, The Journal of Supercomputing.
[8] Fan Guo,et al. Scaling Embedded In-Situ Indexing with DeltaFS , 2018, SC18: International Conference for High Performance Computing, Networking, Storage and Analysis.
[9] John Shalf,et al. Characterizing and predicting the I/O performance of HPC applications using a parameterized synthetic benchmark , 2008, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis.
[10] Robert B. Ross,et al. On the Root Causes of Cross-Application I/O Interference in HPC Storage Systems , 2016, 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS).
[11] Robert B. Ross,et al. Accelerating I/O Forwarding in IBM Blue Gene/P Systems , 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.
[12] Leslie G. Valiant,et al. A bridging model for parallel computation , 1990, CACM.
[13] John Bent,et al. PLFS: a checkpoint filesystem for parallel applications , 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.
[14] Teng Wang,et al. An Ephemeral Burst-Buffer File System for Scientific Applications , 2016 .
[15] Valentina Timcenko,et al. Ext4 file system performance analysis in linux environment , 2011 .
[16] David A Dillow,et al. Lessons Learned in Deploying the World’s Largest Scale Lustre File System , 2010 .
[17] Toni Cortes,et al. Using filesystem virtualization to avoid metadata bottlenecks , 2010, 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010).
[18] André Brinkmann,et al. GekkoFS - A Temporary Distributed File System for HPC Applications , 2018, 2018 IEEE International Conference on Cluster Computing (CLUSTER).
[19] Wei-keng Liao,et al. Dynamically adapting file domain partitioning methods for collective I/O based on underlying parallel file system locking protocols , 2008, HiPC 2008.
[20] Bran Selic,et al. A survey of fault tolerance mechanisms and checkpoint/restart implementations for high performance computing systems , 2013, The Journal of Supercomputing.
[21] Frank B. Schmuck,et al. GPFS: A Shared-Disk File System for Large Computing Clusters , 2002, FAST.
[22] Kamil Iskra,et al. ZOID: I/O-forwarding infrastructure for petascale architectures , 2008, PPoPP.
[23] Zhe Zhang,et al. Enhancing I/O throughput via efficient routing and placement for large-scale parallel file systems , 2011, 30th IEEE International Performance Computing and Communications Conference.
[24] Karsten Schwan,et al. Managing Variability in the IO Performance of Petascale Storage Systems , 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.
[25] Haryadi S. Gunawi,et al. Fail-Slow at Scale , 2018 .
[26] Shane Snyder,et al. A Year in the Life of a Parallel File System , 2018, SC18: International Conference for High Performance Computing, Networking, Storage and Analysis.
[27] Michael A. Bender,et al. File Systems Fated for Senescence? Nonsense, Says Science! , 2017, FAST.
[28] Jie Yao,et al. ROS , 2018, ACM Transactions on Storage.
[29] Bingsheng He,et al. Incorporating Probabilistic Optimizations for Resource Provisioning of Data Processing Workflows , 2019, ICPP.
[30] Robert B. Ross,et al. Fail-Slow at Scale , 2018, ACM Trans. Storage.
[31] Wenguang Chen,et al. ShenTu: Processing Multi-Trillion Edge Graphs on Millions of Cores in Seconds , 2018, SC18: International Conference for High Performance Computing, Networking, Storage and Analysis.
[32] Weiguo Liu,et al. Automatic, Application-Aware I/O Forwarding Resource Allocation , 2019, FAST.
[33] Franck Cappello,et al. Scheduling the I/O of HPC Applications Under Congestion , 2015, 2015 IEEE International Parallel and Distributed Processing Symposium.
[34] S.A. Brandt,et al. CRUSH: Controlled, Scalable, Decentralized Placement of Replicated Data , 2006, ACM/IEEE SC 2006 Conference (SC'06).
[35] André Brinkmann,et al. A Configurable Rule based Classful Token Bucket Filter Network Request Scheduler for the Lustre File System , 2017, SC17: International Conference for High Performance Computing, Networking, Storage and Analysis.
[36] Felix Wolf,et al. Scalable massively parallel I/O to task-local files , 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.
[37] Srikanth Kandula,et al. Graphene: Packing and Dependency-aware Scheduling for Data-parallel Clusters , 2016, OSDI.
[38] Andrew J. Hutton,et al. Lustre: Building a File System for 1,000-node Clusters , 2003 .
[39] Weiguo Liu,et al. End-to-end I/O Monitoring on Leading Supercomputers , 2022, NSDI.