Enterprise HPC storage systems

High-performance computing systems place ever-increasing demands on their primary attached storage systems. Modern filesystem deployments are no longer treated as “scratch” space only; data availability and data integrity are now considered as important as performance. In today's marketplace, as in advanced research, the demands on a storage system grow along several fronts, including capacity, data integrity, system reliability, and system manageability, in addition to an ever-increasing need for performance. Understanding the available tuning parameters, the enhancements in RAID reliability, and the tradeoffs between partially opposing tuning models has therefore become a critical skill.

The Lustre filesystem has for many years been the most popular distributed filesystem for HPC. While the Lustre community to date has been partial to older Lustre server and client releases, a number of the new features desired by many users require moving to more modern versions. The dilemma historically has been that client versions such as 1.8.x outperform the 2.x releases, yet support for modern Linux kernels is only available in recent Lustre server releases, and newly implemented feature sets require moving to newer client versions.

This paper examines the client and server tuning models, security features, and the advent of new RAID schemes, along with their implications for performance. In particular, when using benchmarking tools such as IOR, new testing models and parameter sets require storage I/O benchmarking to change and to more closely mimic contemporary application I/O workloads.
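As a concrete illustration of the client-side tuning surface discussed above, the following sketch shows a few commonly adjusted Lustre client parameters. The mount point, directory, and values are hypothetical examples, not recommendations; appropriate settings depend on the workload and the Lustre version in use.

    # Stripe a directory across all OSTs with a 4 MiB stripe size
    # (path and values are illustrative only)
    lfs setstripe -c -1 -S 4M /lustre/scratch/project

    # Raise the number of concurrent RPCs each client keeps in flight per OST
    lctl set_param osc.*.max_rpcs_in_flight=16

    # Allow more dirty data to be cached per OSC before writeback
    lctl set_param osc.*.max_dirty_mb=512

    # Increase client read-ahead for streaming reads
    lctl set_param llite.*.max_read_ahead_mb=256

Several of these parameters trade client memory consumption against throughput, which is precisely the kind of partially opposing tuning model examined here.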
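Similarly, the sketch below contrasts a classic streaming file-per-process IOR run with a small-block, random-offset run that is closer to contemporary application I/O. The flags are standard IOR options, but the MPI launcher, process count, transfer sizes, and test paths are assumptions for illustration.

    # Streaming file-per-process write/read, 1 MiB transfers,
    # with fsync (-e) and task reordering on read-back (-C) to defeat caching
    mpirun -np 64 ior -a POSIX -F -w -r -e -C -i 3 -t 1m -b 4g \
        -o /lustre/scratch/ior_stream

    # Small random-offset (-z) transfers to a shared file, closer to
    # many modern application I/O patterns
    mpirun -np 64 ior -a MPIIO -w -r -z -i 3 -t 4k -b 64m \
        -o /lustre/scratch/ior_random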