Real-time volume rendering on shared memory multiprocessors using the shear-warp factorization

This paper presents a new parallel volume rendering algorithm that can render 256³ voxel medical data sets at over 10 Hz and 128³ voxel data sets at over 30 Hz on a 16-processor Silicon Graphics Challenge. The algorithm achieves these results by minimizing each of the three components of execution time: computation time, synchronization time, and data communication time. Computation time is low because the parallel algorithm is based on the recently reported shear-warp serial volume rendering algorithm, which is over five times faster than previous serial algorithms. Synchronization time is minimized by using dynamic load balancing and a task partition that minimizes synchronization events. Data communication costs are low because the algorithm is implemented for shared-memory multiprocessors, a class of machines with hardware support for low-latency fine-grain communication and hardware caching to hide latency. We draw two conclusions from our implementation. First, we find that on shared-memory architectures data redistribution and communication costs do not dominate rendering time. Second, we find that cache locality requirements impose a limit on parallelism in volume rendering algorithms. Specifically, our results indicate that shared-memory machines with hundreds of processors would be useful only for rendering very large data sets.

CR Categories: D.1.3 [Concurrent Programming]: Parallel Programming; I.3.3 [Computer Graphics]: Picture/Image Generation--Display Algorithms; I.3.7 [Computer Graphics]: Three-Dimensional Graphics and Realism.
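
The abstract attributes low synchronization time to dynamic load balancing combined with a task partition that keeps synchronization events rare. The following is a minimal sketch of that general pattern, not the paper's implementation: it assumes the image is split into contiguous blocks of scanlines that workers pull from a shared queue. The names `render_scanline`, `render_parallel`, and the `chunk` parameter are hypothetical, and the per-scanline work is a placeholder rather than shear-warp compositing.

```python
import threading
from queue import Queue, Empty


def render_scanline(y, width):
    # Hypothetical stand-in for the real per-scanline compositing work.
    return [(x + y) % 256 for x in range(width)]


def render_parallel(height, width, num_workers=4, chunk=8):
    # Task partition (assumed): contiguous blocks of `chunk` scanlines.
    tasks = Queue()
    for start in range(0, height, chunk):
        tasks.put((start, min(start + chunk, height)))

    # Shared output image; each scanline is written by exactly one worker.
    image = [None] * height

    def worker():
        # Dynamic load balancing: a worker takes the next block whenever
        # it becomes idle, so faster workers simply process more blocks.
        while True:
            try:
                start, stop = tasks.get_nowait()
            except Empty:
                return
            for y in range(start, stop):
                image[y] = render_scanline(y, width)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return image


image = render_parallel(height=20, width=5, num_workers=3, chunk=4)
```

In this sketch the only synchronization points are the queue accesses, so a larger `chunk` means fewer synchronization events at the cost of coarser load balancing. CPython threads do not run this pure-Python work in parallel, so the example illustrates the scheduling pattern only and says nothing about performance.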
