A LA-COMA Implementation of Parallel Volume Rendering

Object dataflow is a popular approach to parallel rendering. The scene is statically distributed among processors, and objects are fetched and cached only on demand. Most previous object dataflow methods were implemented on shared-memory architectures and exploited image-space and object-space coherence to reduce cache misses. In this paper, we propose an efficient object dataflow system for incremental rotation on distributed-memory machines. It uses a distributed-directory scheme to track the location of objects at other nodes. Objects migrate and replicate across processors as in Cache-Only Memory Architectures (COMA). During animation, the processors predict and prefetch the data needed for subsequent frames, employing look-ahead (LA) data acquisition to hide communication latency. Load balancing, minimizing network congestion, and optimal algorithm embedding are among the other issues considered in the design. The results on the Cray T3D show good load balancing and significant
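The interplay of directory lookup, replication, and look-ahead prefetching described in the abstract can be illustrated with a minimal single-process sketch. It is not the paper's implementation: the names `Directory`, `Node`, `install`, `access`, and `prefetch` are hypothetical, the directory is centralized here for brevity rather than distributed, LRU replacement is an assumed policy, and the prediction of which blocks the next frame needs is left to the caller.

```python
from collections import OrderedDict


class Directory:
    """Hypothetical directory: block id -> set of node ids holding a copy."""

    def __init__(self):
        self.holders = {}

    def register(self, block, node_id):
        self.holders.setdefault(block, set()).add(node_id)

    def unregister(self, block, node_id):
        self.holders[block].discard(node_id)

    def locate(self, block):
        # Any holder would do; min() just keeps the sketch deterministic.
        return min(self.holders[block])


class Node:
    """A processor whose local memory acts as a COMA-style cache of blocks."""

    def __init__(self, node_id, capacity, directory, nodes):
        self.id = node_id
        self.capacity = capacity
        self.directory = directory
        self.nodes = nodes            # shared dict: node id -> Node
        self.memory = OrderedDict()   # block -> data, kept in LRU order
        self.demand_misses = 0
        nodes[node_id] = self

    def install(self, block, data):
        self.memory[block] = data
        self.memory.move_to_end(block)
        self.directory.register(block, self.id)
        self._evict()

    def _evict(self):
        # Assumed policy: drop the least recently used block that still has
        # a copy elsewhere, so the last copy of a block is never lost.
        while len(self.memory) > self.capacity:
            for victim in self.memory:
                if len(self.directory.holders[victim]) > 1:
                    del self.memory[victim]
                    self.directory.unregister(victim, self.id)
                    break
            else:
                break  # only last copies remain; tolerate the overflow

    def _fetch(self, block):
        # Replicate the block from a remote holder found via the directory.
        owner = self.nodes[self.directory.locate(block)]
        data = owner.memory[block]
        self.install(block, data)
        return data

    def access(self, block):
        """Demand access while rendering the current frame."""
        if block in self.memory:
            self.memory.move_to_end(block)
            return self.memory[block]
        self.demand_misses += 1
        return self._fetch(block)

    def prefetch(self, predicted_blocks):
        """Look-ahead: pull blocks predicted for the next frame in advance."""
        for block in predicted_blocks:
            if block not in self.memory:
                self._fetch(block)
```

In this sketch, a block requested through `prefetch` before the next frame is a local hit when `access` later asks for it, which is the latency-hiding effect the look-ahead scheme aims for. In a real distributed-memory setting the fetch would be an asynchronous message overlapped with rendering.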
