Eurographics Symposium on Parallel Graphics and Visualization (2011)
Optimal Multi-image Processing Streaming Framework on Parallel Heterogeneous Systems

Atlas construction is an important technique in medical image analysis and plays a central role in understanding the variability of brain anatomy. It often requires applying image processing operations to many images (often hundreds of volumetric datasets), which is demanding in both computational power and memory. In this paper we introduce MIP, a Multi-Image Processing streaming framework that harnesses the processing power of heterogeneous CPU/GPU systems. MIP introduces specially designed streaming algorithms and data structures that provide an optimal solution for out-of-core multi-image processing problems in terms of both memory usage and computational efficiency. MIP uses the asynchronous execution mechanism supported by parallel heterogeneous systems to hide the latency inherent in the processing pipeline of out-of-core approaches. Consequently, for computationally intensive problems, the MIP out-of-core solution can achieve the same performance as the in-core solution. We demonstrate the efficiency of the MIP framework on synthetic and real datasets.
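
The latency-hiding idea can be illustrated with a minimal sketch. This is not the MIP implementation: it only shows the general pattern of overlapping loading with computation through a bounded prefetch buffer, applied here to a simple voxel-wise mean over a set of images. The names `stream_mean`, `load_image`, and `prefetch_depth` are hypothetical, images are modeled as flat lists of floats, and a Python thread stands in for the asynchronous disk reads or host-to-device copies a CPU/GPU system would use.

```python
import queue
import threading


def stream_mean(load_image, num_images, prefetch_depth=2):
    """Voxel-wise mean of num_images images, streamed through a bounded buffer.

    load_image(i) is a hypothetical loader returning image i as a flat list
    of floats. Only a small, fixed number of images is resident at once.
    """
    buffer = queue.Queue(maxsize=prefetch_depth)  # bounded: caps memory use

    def producer():
        # Loader thread: fetches the next images while the consumer computes.
        try:
            for i in range(num_images):
                buffer.put(load_image(i))  # blocks while the buffer is full
            buffer.put(None)  # sentinel: no more images
        except Exception as exc:
            buffer.put(exc)  # forward loader failures to the consumer

    threading.Thread(target=producer, daemon=True).start()

    total = None
    count = 0
    while True:
        item = buffer.get()
        if item is None:
            break
        if isinstance(item, Exception):
            raise item
        # Consumer: accumulate this image while the next one is being loaded.
        if total is None:
            total = list(item)
        else:
            total = [t + v for t, v in zip(total, item)]
        count += 1

    if count == 0:
        raise ValueError("no images to average")
    return [t / count for t in total]
```

When the per-image computation takes at least as long as loading one image, the consumer rarely waits on the buffer, which is the situation in which an out-of-core pipeline can approach in-core throughput.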
