P-cloth

We present a novel parallel algorithm for cloth simulation that exploits multiple GPUs for fast computation and for handling very high-resolution meshes. To accelerate implicit integration, we describe new parallel algorithms for sparse matrix-vector multiplication (SpMV) and for dynamic matrix assembly on a multi-GPU workstation. Our algorithms use a novel work queue generation scheme for a fat-tree GPU interconnect topology. Furthermore, we present a novel collision handling scheme that uses spatial hashing for discrete and continuous collision detection, along with a non-linear impact zone solver. Our parallel schemes distribute the computation and storage overhead among multiple GPUs and enable near-interactive simulation of complex cloth meshes that are difficult to handle on a single GPU because of memory limitations. We have evaluated the performance on two multi-GPU workstations (with 4 and 8 GPUs, respectively) using cloth meshes with 0.5M to 1.65M triangles. Our approach reliably handles collisions and generates vivid wrinkles and folds at 2 to 5 fps, which is significantly faster than prior cloth simulation systems. We observe almost linear speedups with respect to the number of GPUs.
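As a rough illustration of how SpMV work and storage can be distributed, the following minimal sketch splits the rows of a CSR matrix into contiguous blocks, one per device, and computes each block's slice of the product independently. It is not the paper's algorithm: the devices are simulated by a sequential loop on the CPU, the work queue generation and fat-tree interconnect scheduling are omitted, and the names `split_rows`, `spmv_block`, and `partitioned_spmv` are hypothetical.

```python
from typing import List, Tuple

# CSR matrix as (row_ptr, col_idx, values); this layout is an assumption
# made for the sketch, not a format taken from the paper.
CSR = Tuple[List[int], List[int], List[float]]


def split_rows(n_rows: int, n_parts: int) -> List[Tuple[int, int]]:
    """Split [0, n_rows) into n_parts contiguous, near-equal row ranges."""
    base, extra = divmod(n_rows, n_parts)
    ranges = []
    start = 0
    for p in range(n_parts):
        # The first `extra` parts receive one additional row.
        stop = start + base + (1 if p < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def spmv_block(csr: CSR, x: List[float], rows: Tuple[int, int]) -> List[float]:
    """Compute y[rows[0]:rows[1]] for y = A x, touching only that row block."""
    row_ptr, col_idx, values = csr
    out = []
    for r in range(rows[0], rows[1]):
        acc = 0.0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            acc += values[k] * x[col_idx[k]]
        out.append(acc)
    return out


def partitioned_spmv(csr: CSR, x: List[float], n_parts: int) -> List[float]:
    """Emulate a multi-device SpMV: one row block per (simulated) device."""
    n_rows = len(csr[0]) - 1
    y: List[float] = []
    for rows in split_rows(n_rows, n_parts):
        # On real hardware each block would run on its own GPU; here the
        # blocks are processed one after another and concatenated.
        y.extend(spmv_block(csr, x, rows))
    return y


# Example: A = [[2, 0, 1], [0, 3, 0], [4, 0, 5]], x = [1, 2, 3]
A = ([0, 2, 3, 5], [0, 2, 1, 0, 2], [2.0, 1.0, 3.0, 4.0, 5.0])
y = partitioned_spmv(A, [1.0, 2.0, 3.0], n_parts=2)
```

Each row block needs only its own rows of the matrix, so matrix storage divides across devices, but every block reads entries of the full input vector, which is why inter-GPU communication matters in a real multi-GPU setting.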
