Toward GPU accelerated topology optimization on unstructured meshes

The present work investigates the feasibility of finite element analysis and topology optimization on unstructured meshes using massively parallel computer architectures, specifically graphics processing units (GPUs). Challenges in the parallel implementation, such as the race condition that arises during parallel assembly, are discussed and resolved with simple algorithms, in this case greedy graph coloring. The parallel implementation of every step in the topology optimization process is benchmarked against an equivalent sequential implementation. The ultimate goal of this work is to speed up the topology optimization process by means of parallel computing on off-the-shelf hardware. Examples are run with both the standard sequential version and the massively parallel version of the implementation to illustrate the advantages and disadvantages of this approach.
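As an illustration of how greedy graph coloring can remove the assembly race condition, the following minimal sketch colors the elements of a mesh so that no two elements sharing a node receive the same color. Elements of one color can then be assembled concurrently without two threads writing to the same global entry. The function names (`greedy_element_coloring`, `group_by_color`) and the small mesh are hypothetical and chosen only for illustration; this is not the implementation used in the work, and it runs sequentially on the CPU rather than on a GPU.

```python
from collections import defaultdict


def greedy_element_coloring(elements):
    """Give each element the smallest color not used by an element sharing a node."""
    # Map every node to the list of elements that touch it.
    node_to_elems = defaultdict(list)
    for e, nodes in enumerate(elements):
        for n in nodes:
            node_to_elems[n].append(e)

    colors = [-1] * len(elements)  # -1 marks "not yet colored"
    for e, nodes in enumerate(elements):
        # Colors already taken by neighbors (elements sharing at least one node).
        used = {
            colors[nb]
            for n in nodes
            for nb in node_to_elems[n]
            if colors[nb] >= 0
        }
        c = 0
        while c in used:
            c += 1
        colors[e] = c
    return colors


def group_by_color(colors):
    """Return one list of element indices per color, ordered by color."""
    groups = defaultdict(list)
    for e, c in enumerate(colors):
        groups[c].append(e)
    return [groups[c] for c in sorted(groups)]


# Hypothetical mesh: three triangles around node 2, plus one isolated triangle.
elements = [(0, 1, 2), (1, 3, 2), (3, 4, 2), (5, 6, 7)]
colors = greedy_element_coloring(elements)
batches = group_by_color(colors)
print(colors)   # [0, 1, 2, 0]
print(batches)  # [[0, 3], [1], [2]]
```

In a GPU setting, each batch would correspond to one kernel launch in which every thread assembles one element. Greedy coloring does not guarantee the minimum number of colors, so the number of batches may be larger than strictly necessary.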
