Performance engineering to achieve real-time high dynamic range imaging

Image-processing applications such as high dynamic range imaging can be carried out efficiently in gradient space. To do so, the image must be transformed to gradient space and back. The forward transformation is fast because it uses simple finite differences, whereas the backward transformation requires solving a partial differential equation. Although an efficient multigrid solver can be used for the backward transformation, it turns out that a straightforward implementation of the standard algorithm does not deliver satisfactory runtimes for real-time high dynamic range compression of larger 2D X-ray images, even on GPUs. We therefore carry out a rigorous performance analysis and derive a performance model for our multigrid algorithm. The model guides us to an improved implementation that achieves more than 25 frames per second for 16.8-megapixel images when performing full high dynamic range compression, including data transfers between CPU and GPU. Combined with a simple OpenGL visualization, this makes real-time parameter studies on medical data sets possible.
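The round trip through gradient space can be illustrated with a minimal sketch. It is not the implementation described in this work: it assumes forward differences with zero-flux (Neumann) boundaries, and it replaces the multigrid solver with plain Gauss-Seidel sweeps, which are far too slow for real-time use but keep the example short. The function names (`forward_gradient`, `divergence`, `solve_poisson`) are hypothetical, and the gradient attenuation that performs the actual dynamic range compression between the two transformations is omitted.

```python
def forward_gradient(u):
    """Forward differences; the gradient is set to zero across the image border."""
    h, w = len(u), len(u[0])
    gx = [[u[y][x + 1] - u[y][x] if x + 1 < w else 0.0 for x in range(w)]
          for y in range(h)]
    gy = [[u[y + 1][x] - u[y][x] if y + 1 < h else 0.0 for x in range(w)]
          for y in range(h)]
    return gx, gy


def divergence(gx, gy):
    """Backward differences, so that divergence(forward_gradient(u)) is the
    5-point Laplacian of u with Neumann boundaries."""
    h, w = len(gx), len(gx[0])
    return [[(gx[y][x] - (gx[y][x - 1] if x > 0 else 0.0))
             + (gy[y][x] - (gy[y - 1][x] if y > 0 else 0.0))
             for x in range(w)] for y in range(h)]


def solve_poisson(f, sweeps=600):
    """Solve Laplace(u) = f by Gauss-Seidel sweeps (assumed stand-in for multigrid).

    With Neumann boundaries the solution is only defined up to a constant,
    so the caller has to fix the mean intensity afterwards.
    """
    h, w = len(f), len(f[0])
    u = [[0.0] * w for _ in range(h)]
    for _ in range(sweeps):
        for y in range(h):
            for x in range(w):
                total, count = 0.0, 0
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w:
                        total += u[ny][nx]
                        count += 1
                # Discrete equation: sum(neighbours) - count * u[y][x] = f[y][x]
                u[y][x] = (total - f[y][x]) / count
    return u


def mean(img):
    return sum(map(sum, img)) / (len(img) * len(img[0]))


# Round trip on a small synthetic image: gradient space and back.
image = [[float((3 * x + 5 * y * y) % 11) for x in range(8)] for y in range(8)]
gx, gy = forward_gradient(image)
recovered = solve_poisson(divergence(gx, gy))
offset = mean(image) - mean(recovered)  # restore the lost constant
recovered = [[v + offset for v in row] for row in recovered]
```

Even in this toy setting the forward transformation is a single pass over the pixels, while the backward transformation needs many sweeps. This is the cost imbalance that motivates a multigrid solver and the performance engineering described in the abstract.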
