Delivering Supercomputing to the Ultrascale

Computational simulations run on large supercomputers must balance their outputs against the needs of the scientist and the capability of the machine. Persistent storage is typically expensive and slow, and its performance grows at a slower rate than the processing power of the machine. This forces scientists to be practical about the size and frequency of the simulation outputs that can later be analyzed to understand the simulation states. Flexibility in how these outputs are traded off, and in how accessible they are, is critical to the success of scientists using supercomputers to understand their science. In situ transformation of the simulation state before it is persistently stored is the focus of this dissertation. The extreme size and parallelism of simulations can cause challenges for visualization and data analysis. These challenges are compounded by the need for analysis algorithms to accept pre-partitioned data, whose partitioning is not always well suited to existing software infrastructures. The work in this dissertation focuses on improving current workflows and software to accept data as it is and to efficiently produce smaller, more information-rich data for persistent storage that is easily consumed by end-user scientists. I attack this problem on both a theoretical and a practical basis, handling outputs that range from completely raw data to information-dense visualizations, and I study methods for managing both the creation and persistence of data products from large-scale simulations.
