High-Performance Computation of Bézier Surfaces on Parallel and Heterogeneous Platforms

Bézier surfaces are mathematical tools employed in a wide variety of applications. Several works in the literature propose parallelization strategies to speed up the computation of Bézier surfaces. These approaches, however, focus mainly on graphics applications and are often not directly applicable to other domains. In this work, we propose a new method for the computation of Bézier surfaces, together with approaches to efficiently map the method onto different platforms (CPUs, and discrete and integrated GPUs). Additionally, we explore CPU–GPU cooperation mechanisms for computing Bézier surfaces on two integrated heterogeneous systems with different characteristics. We perform an exhaustive performance evaluation covering different data types, rendering, and several hardware platforms. The results show that our method achieves speedups as high as 3.12x (double precision) and 2.47x (single precision) on CPU, and 3.69x (double precision) and 13.14x (single precision) on GPU, compared to other methods in the literature. On heterogeneous platforms, CPU–GPU cooperation improves performance by up to 2.09x with respect to the GPU-only version. Our method and the associated parallelization approaches can be readily employed in domains other than computer graphics (e.g., image registration, biomechanical modeling, and flow simulation), and extended to other Bézier formulations and to Bézier constructions of higher order than surfaces.
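For readers unfamiliar with the underlying computation, the following is a minimal sketch of the standard textbook evaluation of a tensor-product Bézier surface with Bernstein polynomials. It is not the new method proposed in this work, which the abstract does not detail; the function names (`bernstein`, `bezier_surface_point`, `bezier_surface_grid`) and the control-grid layout are assumptions made for illustration only.

```python
from math import comb


def bernstein(n, i, t):
    """Bernstein basis polynomial B_{i,n}(t) for t in [0, 1]."""
    return comb(n, i) * t**i * (1.0 - t) ** (n - i)


def bezier_surface_point(ctrl, u, v):
    """Evaluate a tensor-product Bezier surface at parameters (u, v).

    ctrl is an (n+1) x (m+1) grid (list of lists) of control points,
    each a tuple of coordinates. This layout is an illustrative assumption.
    """
    n = len(ctrl) - 1          # degree in the u direction
    m = len(ctrl[0]) - 1       # degree in the v direction
    dim = len(ctrl[0][0])      # number of coordinates per control point
    point = [0.0] * dim
    for i in range(n + 1):
        bu = bernstein(n, i, u)
        for j in range(m + 1):
            w = bu * bernstein(m, j, v)   # blending weight of ctrl[i][j]
            for d in range(dim):
                point[d] += w * ctrl[i][j][d]
    return tuple(point)


def bezier_surface_grid(ctrl, res_u, res_v):
    """Sample the surface on a regular res_u x res_v grid (both >= 2)."""
    return [
        [bezier_surface_point(ctrl, a / (res_u - 1), b / (res_v - 1))
         for b in range(res_v)]
        for a in range(res_u)
    ]
```

Each output sample depends only on the control points and its own (u, v) pair, so the samples can be computed independently of one another. That independence is what makes this kind of computation a natural candidate for the CPU and GPU parallelization discussed above.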
