Performance Modeling of In Situ Rendering

With the push to exascale, in situ visualization and analysis will continue to play an important role in high-performance computing. Tightly coupling in situ visualization with simulations constrains resources for both, and these constraints force a complex balance of trade-offs. A performance model that provides an a priori answer for the cost of using an in situ approach for a given task would assist in managing the trade-offs between simulation and visualization resources. In this work, we present new statistical performance models, based on algorithmic complexity, that accurately predict the run-time cost of a set of representative algorithms for rendering, an essential in situ visualization task. To train and validate the models, we conduct a performance study of an MPI+X rendering infrastructure used in situ with three HPC simulation applications. We then use the models to explore feasibility for selected in situ rendering questions.
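As a rough illustration of what a statistical model based on algorithmic complexity can look like, the sketch below fits a two-coefficient linear model to render times by ordinary least squares. It is not the model presented in this work: the cost term `pixels * log2(objects)`, the function names, the coefficients, and the noise-free synthetic timings are all assumptions chosen only to make the example self-contained.

```python
import math


def complexity_term(pixels, objects):
    # Hypothetical cost term (an assumption, not taken from the paper):
    # one acceleration-structure traversal of depth ~log2(objects) per pixel.
    return pixels * math.log2(objects)


def fit_linear(xs, ys):
    # Closed-form ordinary least squares for y = c0 + c1 * x.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    c1 = sxy / sxx
    c0 = mean_y - c1 * mean_x
    return c0, c1


def predict(c0, c1, pixels, objects):
    # A priori run-time estimate for a configuration that was not measured.
    return c0 + c1 * complexity_term(pixels, objects)


# Synthetic, noise-free "timings" generated from made-up coefficients.
# Real training data would come from measured rendering runs.
TRUE_C0, TRUE_C1 = 0.05, 2.0e-9
configs = [
    (512 * 512, 2 ** 16),
    (1024 * 1024, 2 ** 18),
    (2048 * 2048, 2 ** 20),
    (1024 * 1024, 2 ** 22),
]
xs = [complexity_term(p, o) for p, o in configs]
ys = [TRUE_C0 + TRUE_C1 * x for x in xs]

c0, c1 = fit_linear(xs, ys)
estimate = predict(c0, c1, 1920 * 1080, 2 ** 20)
```

Once fitted, `predict` gives the kind of a priori cost estimate the abstract describes: an expected render time for a given image size and data size, which can then be compared against a time budget before committing resources to the in situ task.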
