This paper describes a software infrastructure made up of tools and libraries designed to assist developers in implementing computational dynamics applications running on heterogeneous and distributed computing environments. Together, these tools and libraries compose a so-called Heterogeneous Computing Template (HCT). The heterogeneous and distributed computing hardware infrastructure is assumed herein to be made up of a combination of CPUs and GPUs. The computational dynamics applications targeted to execute on such a hardware topology include many-body dynamics, smoothed-particle hydrodynamics (SPH) fluid simulation, and fluid-solid interaction analysis. The underlying theme of the solution approach embraced by HCT is that of partitioning the domain of interest into a number of subdomains, each managed by a separate core/accelerator (CPU/GPU) pair. Five components at the core of HCT enable the envisioned distributed computing approach to large-scale dynamical system simulation: (a) a method for geometric domain decomposition and mapping onto heterogeneous hardware; (b) methods for proximity computation or collision detection; (c) support for moving data among the corresponding hardware as elements move from subdomain to subdomain; (d) numerical methods for solving the specific dynamics problem of interest; and (e) tools for performing visualization and post-processing in a distributed manner. In this contribution, components (a) and (c) of the HCT are demonstrated via the example of the Discrete Element Method (DEM) for rigid body dynamics with friction and contact. The collision detection task required in frictional-contact dynamics, i.e., task (b) above, is discussed separately and in the context of GPU computing. This task is shown to benefit from a two-order-of-magnitude gain in efficiency when compared to traditional sequential implementations.
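To make the decomposition idea concrete, the following is a minimal sketch of tasks (a) and (c) under a uniform Cartesian decomposition: positions are mapped to the linear index of the owning subdomain (and hence its CPU/GPU pair), and bodies whose index changes between steps are flagged for migration. All names (`subdomain_of`, `migrations`) and the uniform-grid assumption are illustrative, not taken from the HCT implementation.

```python
def subdomain_of(pos, origin, sub_size, grid_dims):
    """Map a 3D position to the linear index of the subdomain that owns it.

    origin: lower corner of the overall domain
    sub_size: edge lengths of one subdomain
    grid_dims: number of subdomains along each axis
    """
    idx = []
    for d in range(3):
        i = int((pos[d] - origin[d]) // sub_size[d])
        # Clamp so bodies on the outer boundary stay inside the grid.
        idx.append(min(max(i, 0), grid_dims[d] - 1))
    ix, iy, iz = idx
    return ix + grid_dims[0] * (iy + grid_dims[1] * iz)

def migrations(positions, owners, origin, sub_size, grid_dims):
    """Return (body, old_owner, new_owner) triples for bodies that crossed
    a subdomain boundary since the last step -- the data that task (c)
    must ship between CPU/GPU pairs."""
    moves = []
    for i, pos in enumerate(positions):
        new_owner = subdomain_of(pos, origin, sub_size, grid_dims)
        if new_owner != owners[i]:
            moves.append((i, owners[i], new_owner))
    return moves
```

In a distributed setting, each triple returned by `migrations` would drive a message (e.g., via MPI) carrying the body's state to the process managing the destination subdomain; the sketch only identifies which bodies must move.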
Note: Reference herein to any specific commercial products, process, or service by trade name, trademark, manufacturer, or otherwise, does not imply its endorsement, recommendation, or favoring by the US Army. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Army, and shall not be used for advertising or product endorsement purposes. Copyright © 2011 by ASME