A memory-optimized visualization system for limited-bandwidth multiprocessing environments

Object dataflow is a popular approach used in parallel rendering. The data representing the 3D scene is statically distributed among processors and objects are fetched and cached only on demand. Most previous object dataflow methods were implemented on shared memory architectures and exploited spatial coherency to reduce hardware cache misses. We propose an efficient model for object dataflow parallel volume rendering on message passing machines. The algorithm is introduced and its ray storage mechanism is used to support latency hiding by postponing computation on inactive rays. Memory usage is optimized by letting objects migrate and replicate at different processors rather than the common static assignments. Our cache only memory approach uses a distributed directory scheme to trace the location of objects at other nodes. A mechanism to minimize network congestion was implemented which optimizes channel utilization. Unlike previous methods, our approach can benefit from temporal coherence and effectively minimizes communication costs during animation on limited bandwidth multiprocessing environments. We report results of the algorithm's implementation on several platforms like Cray T3D, Convex SPP and DEC alpha cluster of workstations (COWs), and achieved higher efficiency and scalability than existing algorithms.