Paper information - A memory-optimized visualization system for limited-bandwidth multiprocessing environments

Object dataflow is a popular approach to parallel rendering: the data representing the 3D scene is statically distributed among processors, and objects are fetched and cached only on demand. Most previous object dataflow methods were implemented on shared-memory architectures and exploited spatial coherence to reduce hardware cache misses. We propose an efficient model for object dataflow parallel volume rendering on message-passing machines. We introduce the algorithm and use its ray storage mechanism to hide latency by postponing computation on inactive rays. Memory usage is optimized by letting objects migrate to and replicate at different processors instead of relying on the common static assignment. Our cache-only memory approach uses a distributed directory scheme to track the location of objects at other nodes. We also implemented a mechanism that minimizes network congestion and thereby optimizes channel utilization. Unlike previous methods, our approach can benefit from temporal coherence and effectively minimizes communication costs during animation in limited-bandwidth multiprocessing environments. We report results from implementations of the algorithm on several platforms, including the Cray T3D, the Convex SPP, and a cluster of DEC Alpha workstations (COW), where it achieved higher efficiency and scalability than existing algorithms.
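
The abstract names two mechanisms, a ray store that postpones work on rays whose data is not yet local and a distributed directory that records where objects currently reside. The following is a minimal single-process sketch of how the two could interact. It is not the authors' implementation: the class and method names (`Directory`, `Node`, `step`, `deliver`, `demo`), the modulo rule for choosing a block's home node, and the choice of the lowest-numbered holder are all assumptions made for illustration, and real message passing is replaced by a list of recorded requests.

```python
from collections import defaultdict, deque


class Directory:
    """Toy directory (hypothetical): block b's entry is owned by node b % num_nodes."""

    def __init__(self, num_nodes):
        self.num_nodes = num_nodes
        # holders[block_id] -> set of node ids currently caching that block
        self.holders = defaultdict(set)

    def home(self, block_id):
        # Assumed mapping; the source does not say how home nodes are chosen.
        return block_id % self.num_nodes

    def register(self, block_id, node_id):
        self.holders[block_id].add(node_id)

    def locate(self, block_id):
        # Any holder would do; the lowest id keeps the sketch deterministic.
        return min(self.holders[block_id])


class Node:
    def __init__(self, node_id, directory):
        self.node_id = node_id
        self.directory = directory
        self.cache = set()               # block ids held locally
        self.active = deque()            # (ray_id, block_id) ready to process
        self.parked = defaultdict(list)  # block_id -> rays waiting for that block
        self.requests = []               # (block_id, home_node, holder) fetches issued
        self.done = []                   # ray ids in the order they were processed

    def add_block(self, block_id):
        self.cache.add(block_id)
        self.directory.register(block_id, self.node_id)

    def step(self):
        """Take one active ray; process it, or park it if its block is not local."""
        ray_id, block_id = self.active.popleft()
        if block_id in self.cache:
            self.done.append(ray_id)
            return
        if block_id not in self.parked:  # first miss: issue one fetch only
            d = self.directory
            self.requests.append((block_id, d.home(block_id), d.locate(block_id)))
        self.parked[block_id].append(ray_id)

    def deliver(self, block_id):
        """A requested block arrived: keep a local replica and wake its rays."""
        self.add_block(block_id)
        for ray_id in self.parked.pop(block_id, []):
            self.active.append((ray_id, block_id))


def demo():
    directory = Directory(num_nodes=2)
    n0, n1 = Node(0, directory), Node(1, directory)
    n0.add_block(0)
    n1.add_block(1)
    n0.active.extend([("r0", 1), ("r1", 0), ("r2", 1)])
    while n0.active:   # r0 and r2 are parked; r1 proceeds without waiting
        n0.step()
    n0.deliver(1)      # simulated arrival of block 1 from node 1
    while n0.active:
        n0.step()
    return n0
```

In `demo()`, the two rays that need the remote block are parked behind a single fetch while the ray with local data is processed, which is the latency-hiding idea in miniature. After delivery the directory lists both nodes as holders of the block, mirroring replication rather than a fixed static assignment.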

Roni Yagel | Asish Law
