Solvers on advanced parallel architectures with application to partial differential equations and discrete optimisation
[1] Greg Humphreys,et al. A multigrid solver for boundary value problems using programmable graphics hardware , 2003, HWWS '03.
[2] V. Cung,et al. A scatter search based approach for the quadratic assignment problem , 1997, Proceedings of 1997 IEEE International Conference on Evolutionary Computation (ICEC '97).
[3] Michael Garland,et al. Efficient Sparse Matrix-Vector Multiplication on CUDA , 2008 .
[4] Yao Zhang,et al. An Auto-tuned Method for Solving Large Tridiagonal Systems on the GPU , 2011, 2011 IEEE International Parallel & Distributed Processing Symposium.
[5] Rainald Loehner,et al. Overlapping unstructured grids , 2001 .
[6] Eli Upfal,et al. Efficient Algorithms for All-to-All Communications in Multiport Message-Passing Systems , 1997, IEEE Trans. Parallel Distributed Syst..
[7] Jie Cheng,et al. Programming Massively Parallel Processors. A Hands-on Approach , 2010, Scalable Comput. Pract. Exp..
[8] Xiaoye S. Li,et al. An overview of SuperLU: Algorithms, implementation, and user interface , 2003, TOMS.
[9] O. C. Zienkiewicz,et al. The Finite Element Method: Its Basis and Fundamentals , 2005 .
[10] Joel H. Ferziger,et al. Computational methods for fluid dynamics , 1996 .
[11] Cleve B. Moler,et al. Iterative Refinement in Floating Point , 1967, JACM.
[12] Liqiang Wang,et al. Auto-Tuning CUDA Parameters for Sparse Matrix-Vector Multiplication on GPUs , 2010, 2010 International Conference on Computational and Information Sciences.
[13] Shiming Yang,et al. The optimal relaxation parameter for the SOR method applied to the Poisson equation in any space dimensions , 2009, Appl. Math. Lett..
[14] Nectarios Koziris,et al. Optimizing sparse matrix-vector multiplication using index and value compression , 2008, CF '08.
[15] Guillaume Caumon,et al. Concurrent number cruncher: a GPU implementation of a general sparse linear solver , 2009, Int. J. Parallel Emergent Distributed Syst..
[17] Inanc Senocak,et al. An MPI-CUDA Implementation for Massively Parallel Incompressible Flow Computations on Multi-GPU Clusters , 2010 .
[18] Fred W. Glover,et al. A Template for Scatter Search and Path Relinking , 1997, Artificial Evolution.
[19] Uday Bondhugula,et al. Believe it or Not! Multicore CPUs can Match GPUs for FLOP-intensive Applications , 2010 .
[20] Kyle Chand,et al. Component‐based hybrid mesh generation , 2005 .
[21] M. Hestenes,et al. Methods of conjugate gradients for solving linear systems , 1952 .
[22] Timothy A. Davis,et al. Algorithm 832: UMFPACK V4.3---an unsymmetric-pattern multifrontal method , 2004, TOMS.
[23] Robert Strzodka,et al. Cyclic Reduction Tridiagonal Solvers on GPUs Applied to Mixed-Precision Multigrid , 2011, IEEE Transactions on Parallel and Distributed Systems.
[24] François Bodin,et al. Heterogeneous multicore parallel programming for graphics processing units , 2009 .
[25] J. Gillis,et al. Matrix Iterative Analysis , 1961 .
[26] Stefan Turek,et al. GPU acceleration of an unmodified parallel finite element Navier-Stokes solver , 2009, 2009 International Conference on High Performance Computing & Simulation.
[27] R. LeVeque. Finite Volume Methods for Hyperbolic Problems: Characteristics and Riemann Problems for Linear Hyperbolic Equations , 2002 .
[28] Nair Maria Maia de Abreu,et al. A survey for the quadratic assignment problem , 2007, Eur. J. Oper. Res..
[29] José Ranilla,et al. Neville elimination on multi- and many-core systems: OpenMP, MPI and CUDA , 2011, The Journal of Supercomputing.
[30] Andrew Lumsdaine,et al. Accelerating sparse matrix computations via data compression , 2006, ICS '06.
[31] G. Goertzel. An Algorithm for the Evaluation of Finite Trigonometric Series , 1958 .
[32] Morgan Pickering. An Introduction to Fast Fourier Transform Methods for Partial Differential Equations, with Applications , 1986 .
[33] H. Matthies,et al. Classification and Overview of Meshfree Methods , 2004 .
[34] Helmar Burkhart,et al. General-Purpose Sparse Matrix Building Blocks using the NVIDIA CUDA Technology Platform , 2007 .
[35] F. Rendl,et al. A thermodynamically motivated simulation procedure for combinatorial optimization problems , 1984 .
[36] J. H. Wilkinson. The algebraic eigenvalue problem , 1966 .
[37] A. N. Elshafei,et al. Hospital Layout as a Quadratic Assignment Problem , 1977 .
[38] Chung-Yuan Huang,et al. Recent progress in multiblock hybrid structured and unstructured mesh generation , 1997 .
[39] T. Koopmans,et al. Assignment Problems and the Location of Economic Activities , 1957 .
[40] Michal Czapinski,et al. An effective Parallel Multistart Tabu Search for Quadratic Assignment Problem on CUDA platform , 2013, J. Parallel Distributed Comput..
[41] Bruce Hendrickson,et al. Support Theory for Preconditioning , 2003, SIAM J. Matrix Anal. Appl..
[42] B. Eng,et al. The Use of Parallel Polynomial Preconditioners in the Solution of Systems of Linear Equations , 2005 .
[43] Paulius Micikevicius,et al. 3D finite difference computation on GPUs using CUDA , 2009, GPGPU-2.
[44] G. Peters,et al. Iterative refinement of the solution of a positive definite system of equations , 1966 .
[45] Yao Zhang,et al. Fast tridiagonal solvers on the GPU , 2010, PPoPP '10.
[46] Timothy A. Davis,et al. Dynamic Supernodes in Sparse Cholesky Update/Downdate and Triangular Solves , 2009, TOMS.
[47] Erik Lindholm,et al. NVIDIA Tesla: A Unified Graphics and Computing Architecture , 2008, IEEE Micro.
[48] Peng Li,et al. Multigrid on GPU: tackling power grid analysis on parallel SIMT platforms , 2008, ICCAD 2008.
[49] Robert Strzodka,et al. Using GPUs to improve multigrid solver performance on a cluster , 2008, Int. J. Comput. Sci. Eng..
[50] Vivek Sarkar,et al. JCUDA: A Programmer-Friendly Interface for Accelerating Java Programs with CUDA , 2009, Euro-Par.
[51] Nouredine Melab,et al. Parallel Local Search on GPU , 2009 .
[52] Stephen A. Jarvis,et al. Performance analysis of a hybrid MPI/CUDA implementation of the NASLU benchmark , 2011, PERV.
[53] Éric D. Taillard,et al. Robust taboo search for the quadratic assignment problem , 1991, Parallel Comput..
[54] Michael Griebel,et al. Meshfree Methods for Partial Differential Equations , 2002 .
[55] Eugenio Oñate,et al. The meshless finite element method , 2003 .
[56] Yao Zhang,et al. Scan primitives for GPU computing , 2007, GH '07.
[57] J. Ortega,et al. A multi-color SOR method for parallel computation , 1982, ICPP.
[58] Yousef Saad,et al. Iterative methods for sparse linear systems , 2003 .
[59] Keith D. Underwood,et al. Analyzing the Impact of Overlap, Offload, and Independent Progress for Message Passing Interface Applications , 2005, Int. J. High Perform. Comput. Appl..
[60] Wen-mei W. Hwu,et al. MCUDA: An Efficient Implementation of CUDA Kernels for Multi-core CPUs , 2008, LCPC.
[61] Chi-Bang Kuan,et al. Automated Empirical Optimization , 2011, Encyclopedia of Parallel Computing.
[62] Jack J. Dongarra,et al. An Improved Magma Gemm For Fermi Graphics Processing Units , 2010, Int. J. High Perform. Comput. Appl..
[63] José Miguel Mantas,et al. An MPI-CUDA implementation of an improved Roe method for two-layer shallow water systems , 2012, J. Parallel Distributed Comput..
[64] Hee-Seok Kim,et al. A Scalable Tridiagonal Solver for GPUs , 2011, 2011 International Conference on Parallel Processing.
[65] Jack J. Dongarra,et al. Towards dense linear algebra for hybrid GPU accelerated manycore systems , 2009, Parallel Comput..
[66] Zvi Drezner,et al. A New Genetic Algorithm for the Quadratic Assignment Problem , 2003, INFORMS J. Comput..
[67] Tamara G. Kolda,et al. An overview of the Trilinos project , 2005, TOMS.
[68] S. Kaniel. Estimates for Some Computational Techniques - in Linear Algebra , 1966 .
[69] Jack J. Dongarra,et al. Block-asynchronous Multigrid Smoothers for GPU-accelerated Systems , 2011, ICCS.
[70] Timothy A. Davis,et al. Modifying a Sparse Cholesky Factorization , 1999, SIAM J. Matrix Anal. Appl..
[71] Y. Saad,et al. GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems , 1986 .
[72] R. Fletcher. Conjugate gradient methods for indefinite systems , 1976 .
[73] Olaf Schenk,et al. Solving unsymmetric sparse systems of linear equations with PARDISO , 2004, Future Gener. Comput. Syst..
[74] R. Eymard,et al. Finite Volume Methods , 2019, Computational Methods for Fluid Dynamics.
[75] Jack J. Dongarra,et al. Overlapping Computation and Communication for Advection on Hybrid Parallel Computers , 2011, 2011 IEEE International Parallel & Distributed Processing Symposium.
[76] Nikos Chrisochoides,et al. Parallel Mesh Generation , 2006 .
[77] Yao Zhang,et al. Parallel Computing Experiences with CUDA , 2008, IEEE Micro.
[78] John K. Reid,et al. The Multifrontal Solution of Indefinite Sparse Symmetric Linear , 1983, TOMS.
[79] Timothy A. Davis,et al. A column pre-ordering strategy for the unsymmetric-pattern multifrontal method , 2004, TOMS.
[80] Jack J. Dongarra,et al. Dense linear algebra solvers for multicore with GPU accelerators , 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW).
[81] I. Duff,et al. Direct Methods for Sparse Matrices , 1987 .
[82] M. Dorigo,et al. Ant colonies for the quadratic assignment problem , 1999, J. Oper. Res. Soc..
[83] Chao-Tung Yang,et al. Hybrid CUDA, OpenMP, and MPI parallel programming on multicore GPU clusters , 2011, Comput. Phys. Commun..
[84] Chia-Jung Hsu. Numerical Heat Transfer and Fluid Flow , 1981 .
[85] Tom Shanley,et al. Infiniband Network Architecture , 2002 .
[86] Mark J. Harris,et al. Parallel Prefix Sum (Scan) with CUDA , 2011 .
[87] C. Loan. Computational Frameworks for the Fast Fourier Transform , 1992 .
[88] James Demmel,et al. LU, QR and Cholesky Factorizations using Vector Capabilities of GPUs , 2008 .
[89] Michael T. Heath,et al. Parallel Algorithms for Sparse Linear Systems , 1991, SIAM Rev..
[90] Joseph JáJá,et al. An Optimized FFT-Based Direct Poisson Solver on CUDA GPUs , 2014, IEEE Transactions on Parallel and Distributed Systems.
[91] Roman Wyrzykowski,et al. Parallel Implementation of Conjugate Gradient Method on Graphics Processors , 2009, PPAM.
[92] David Connolly. An improved annealing scheme for the QAP , 1990 .
[93] Bernd Freisleben,et al. Fitness landscape analysis and memetic algorithms for the quadratic assignment problem , 2000, IEEE Trans. Evol. Comput..
[94] Jesús Carretero,et al. Reordering Algorithms for Increasing Locality on Multicore Processors , 2008, 2008 10th IEEE International Conference on High Performance Computing and Communications.
[95] Jack Dongarra,et al. 1. High-Performance Computing , 1998 .
[96] Nathan Ida,et al. Introduction to the Finite Element Method , 1997 .
[97] Chris Thompson,et al. Reducing Communication Overhead in Multi-GPU Hybrid Solver for 2D Laplace’s Equation , 2013, International Journal of Parallel Programming.
[98] B. P. Leonard,et al. The ULTIMATE conservative difference scheme applied to unsteady one-dimensional advection , 1991 .
[99] David E. Bernholdt,et al. A framework for characterizing overlap of communication and computation in parallel applications , 2008, Cluster Computing.
[100] W. Press,et al. Numerical Recipes: The Art of Scientific Computing , 1987 .
[101] David J. Evans,et al. Parallel S.O.R. iterative methods , 1984, Parallel Comput..
[102] Satoshi Matsuoka,et al. Auto-tuning 3-D FFT library for CUDA GPUs , 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.
[103] Yutaka Ishikawa,et al. Optimization of MPI persistent communication , 2013, EuroMPI.
[104] Juan C. Heinrich,et al. The Finite Element Method: Basic Concepts And Applications , 1992 .
[105] Gilbert Laporte,et al. A Combinatorial Optimization Problem Arising in Dartboard Design , 1991 .
[106] C. Lanczos. Solution of Systems of Linear Equations by Minimized Iterations , 1952 .
[107] Luc Giraud,et al. A Parallel Distributed Fast 3D Poisson Solver for Méso-NH , 1999, Euro-Par.
[108] Wen-mei W. Hwu,et al. Optimization principles and application performance evaluation of a multithreaded GPU using CUDA , 2008, PPoPP.
[109] Kurt M. Anstreicher,et al. The Steinberg Wiring Problem , 2004, The Sharpest Cut.
[110] Chihiro Iwamura,et al. An efficient algebraic multigrid preconditioned conjugate gradient solver , 2003 .
[111] W. Cheney,et al. Numerical analysis: mathematics of scientific computing (2nd ed) , 1991 .
[112] José M. F. Moura,et al. Algebraic Signal Processing Theory: Cooley–Tukey Type Algorithms for DCTs and DSTs , 2007, IEEE Transactions on Signal Processing.
[113] Phillip Colella,et al. Advanced 3D Poisson solvers and particle-in-cell methods for accelerator modeling , 2005 .
[114] G. Amdahl,et al. Validity of the single processor approach to achieving large scale computing capabilities , 1967, AFIPS '67 (Spring).
[115] F. Magoulès,et al. An optimized Schwarz method with two‐sided Robin transmission conditions for the Helmholtz equation , 2007 .
[116] H. V. D. Vorst,et al. The rate of convergence of Conjugate Gradients , 1986 .
[117] Dulcenéia Becker. Parallel unstructured solvers for linear partial differential equations , 2006 .
[118] Thomas Stützle,et al. ACO algorithms for the quadratic assignment problem , 1999 .
[119] Roger W. Hockney,et al. A Fast Direct Solution of Poisson's Equation Using Fourier Analysis , 1965, JACM.
[120] P. Sonneveld. CGS, A Fast Lanczos-Type Solver for Nonsymmetric Linear systems , 1989 .
[121] Kevin Skadron,et al. A performance study of general-purpose applications on graphics processors using CUDA , 2008, J. Parallel Distributed Comput..
[122] Roberto Battiti,et al. The Reactive Tabu Search , 1994, INFORMS J. Comput..
[123] C. Lanczos. An iteration method for the solution of the eigenvalue problem of linear differential and integral operators , 1950 .
[124] Timothy A. Davis,et al. An Unsymmetric-pattern Multifrontal Method for Sparse Lu Factorization , 1993 .
[125] J. Tukey,et al. An algorithm for the machine calculation of complex Fourier series , 1965 .
[126] M. Saunders,et al. Solution of Sparse Indefinite Systems of Linear Equations , 1975 .
[127] James Demmel,et al. the Parallel Computing Landscape , 2022 .
[128] Nicholas I. M. Gould,et al. A numerical evaluation of sparse direct solvers for the solution of large sparse symmetric linear systems of equations , 2007, TOMS.
[129] Harold S. Stone,et al. An Efficient Parallel Algorithm for the Solution of a Tridiagonal Linear System of Equations , 1973, JACM.
[130] Sébastien Loisel,et al. On the Convergence of Optimized Schwarz Methods by way of Matrix Analysis , 2009 .
[131] Michael J. Flynn,et al. Some Computer Organizations and Their Effectiveness , 1972, IEEE Transactions on Computers.
[132] Bryan Schauer. Multicore Processors - A Necessity , 2008 .
[133] K. Atkinson. Elementary numerical analysis , 1985 .
[134] James Demmel,et al. Applied Numerical Linear Algebra , 1997 .
[135] Torsten Hoefler,et al. Optimizing a conjugate gradient solver with non-blocking collective operations , 2007, Parallel Comput..
[136] Alan H. Karp,et al. Measuring parallel processor performance , 1990, CACM.
[137] Robert Strzodka,et al. Performance and accuracy of hardware-oriented native-, emulated- and mixed-precision solvers in FEM simulations , 2007, Int. J. Parallel Emergent Distributed Syst..
[138] R. Dolbeau,et al. HMPP™: A Hybrid Multi-core Parallel Programming Environment , 2022 .
[139] J. Dongarra,et al. Exploiting the Performance of 32 bit Floating Point Arithmetic in Obtaining 64 bit Accuracy (Revisiting Iterative Refinement for Linear Systems) , 2006, ACM/IEEE SC 2006 Conference (SC'06).
[140] Orion S. Lawlor,et al. Message passing for GPGPU clusters: CudaMPI , 2009, 2009 IEEE International Conference on Cluster Computing and Workshops.
[141] Y. Saad,et al. Iterative solution of linear systems in the 20th century , 2000 .
[142] Jing Wu,et al. Optimized strategies for mapping three-dimensional FFTs onto CUDA GPUs , 2012, 2012 Innovative Parallel Computing (InPar).
[143] Satoshi Matsuoka,et al. Fast Conjugate Gradients with Multiple GPUs , 2009, ICCS.
[144] M. Benzi. Preconditioning techniques for large linear systems: a survey , 2002 .
[145] James Demmel,et al. A Supernodal Approach to Sparse Partial Pivoting , 1999, SIAM J. Matrix Anal. Appl..
[146] Martin J. Gander,et al. Optimized Schwarz Methods , 2006, SIAM J. Numer. Anal..
[147] Jack J. Dongarra,et al. Using Mixed Precision for Sparse Matrix Computations to Enhance the Performance while Achieving 64-bit Accuracy , 2008, TOMS.
[148] Jakob Krarup,et al. Computer-aided layout design , 1978 .
[149] William L. Briggs,et al. A multigrid tutorial, Second Edition , 2000 .
[150] Samuel Williams,et al. Optimization of sparse matrix-vector multiplication on emerging multicore platforms , 2009, Parallel Comput..
[151] Jack Dongarra,et al. PVM: Parallel virtual machine: a users' guide and tutorial for networked parallel computing , 1995 .
[152] Fred W. Glover,et al. Future paths for integer programming and links to artificial intelligence , 1986, Comput. Oper. Res..
[153] Gene H. Golub,et al. Matrix computations , 1983 .
[154] Timothy A. Davis,et al. A combined unifrontal/multifrontal method for unsymmetric sparse matrices , 1999, TOMS.
[155] James Reinders,et al. Intel Xeon Phi Coprocessor High Performance Programming , 2013 .
[156] Michal Czapinski,et al. Tabu Search with two approaches to parallel flowshop evaluation on CUDA platform , 2011, J. Parallel Distributed Comput..
[157] Leon Steinberg,et al. The Backboard Wiring Problem: A Placement Algorithm , 1961 .
[158] Anne Greenbaum,et al. Approximating the inverse of a matrix for use in iterative algorithms on vector processors , 1979, Computing.
[159] Dean G. Duffy,et al. Transform Methods for Solving Partial Differential Equations , 2004 .
[160] Mark Frederick Hoemmen,et al. An Overview of Trilinos , 2003 .
[161] Robert Strzodka,et al. Exploring weak scalability for FEM calculations on a GPU-enhanced cluster , 2007, Parallel Comput..
[162] Weihang Zhu,et al. SIMD tabu search for the quadratic assignment problem with graphics hardware acceleration , 2010 .
[163] J. Shewchuk. An Introduction to the Conjugate Gradient Method Without the Agonizing Pain , 1994 .
[164] Eric Darve,et al. Large calculation of the flow over a hypersonic vehicle using a GPU , 2008, J. Comput. Phys..
[165] Christopher P. Thompson,et al. A Novel, Parallel PDE Solver for Unstructured Grids , 2005, LSSC.
[166] Andreas Koch,et al. A Fast GPU Implementation for Solving Sparse Ill-Posed Linear Equation Systems , 2009, PPAM.
[167] Michael J. Quinn,et al. Parallel programming in C with MPI and OpenMP , 2003 .
[168] L. Giraud,et al. Algebraic Domain Decomposition Preconditioners , 2006 .
[169] Juliane Junker. Finite Elements For Analysis And Design , 2016 .
[170] Rob H. Bisseling,et al. Accelerating a barotropic ocean model using a GPU , 2012 .
[171] Petter E. Bjørstad. Multiplicative And Additive Schwarz' Methods: Convergence In The 2-Domain Case , 1989 .
[172] Fred W. Glover,et al. Multistart Tabu Search and Diversification Strategies for the Quadratic Assignment Problem , 2009, IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans.
[173] Christina Freytag,et al. Using MPI: Portable Parallel Programming with the Message Passing Interface , 2016 .
[174] Fábio Henrique Pereira,et al. A fast algebraic multigrid preconditioned conjugate gradient solver , 2006, Appl. Math. Comput..
[175] Rajeev Thakur,et al. Test suite for evaluating performance of multithreaded MPI communication , 2009, Parallel Comput..
[176] R. Temam,et al. Navier-Stokes equations: theory and numerical analysis , 1978 .
[177] Christoph W. Kessler,et al. Practical PRAM programming , 2000, Wiley series on parallel and distributed computing.
[178] John W. Dickey,et al. Campus building arrangement using topaz , 1972 .
[179] Hui Wu,et al. Parallelizing SOR for GPGPUs using alternate loop tiling , 2012, Parallel Comput..
[180] Jens H. Krüger,et al. A Survey of General‐Purpose Computation on Graphics Hardware , 2007, Eurographics.
[181] J. Demmel,et al. Sun Microsystems , 1996 .
[182] F. Browder,et al. Partial Differential Equations in the 20th Century , 1998 .
[183] Timothy A. Davis,et al. The university of Florida sparse matrix collection , 2011, TOMS.
[184] Rohit Chandra,et al. Parallel programming in OpenMP , 2000 .
[185] George Havas,et al. On the worst-case complexity of integer Gaussian elimination , 1997, ISSAC.
[186] Antoine Petitet,et al. Minimizing development and maintenance costs in supporting persistently optimized BLAS , 2005 .
[187] Robert D. Falgout. An Introduction to Algebraic Multigrid , 2006 .
[188] Zhen Wang,et al. Block-Relaxation Methods for 3D Constant-Coefficient Stencils on GPUs and Multicore CPUs , 2012, ArXiv.
[189] Kevin Skadron,et al. Scalable parallel programming , 2008, 2008 IEEE Hot Chips 20 Symposium (HCS).
[190] R. Freund,et al. QMR: a quasi-minimal residual method for non-Hermitian linear systems , 1991 .
[191] Richard Barrett,et al. Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods , 1994, Other Titles in Applied Mathematics.
[192] Sven Rahmann,et al. Microarray Layout as Quadratic Assignment Problem , 2006, German Conference on Bioinformatics.
[193] Chenhan D. Yu,et al. A CPU-GPU hybrid approach for the unsymmetric multifrontal method , 2011, Parallel Comput..
[194] Ninghui Sun,et al. SMAT: an input adaptive auto-tuner for sparse matrix-vector multiplication , 2013, PLDI.
[195] Steven G. Johnson,et al. The Design and Implementation of FFTW3 , 2005, Proceedings of the IEEE.
[196] Weeratunge Malalasekera,et al. An introduction to computational fluid dynamics - the finite volume method , 2007 .
[197] David H. Bailey,et al. The NAS parallel benchmarks summary and preliminary results , 1991, Proceedings of the 1991 ACM/IEEE Conference on Supercomputing (Supercomputing '91).
[198] D. Young. Iterative methods for solving partial difference equations of elliptic type , 1954 .
[199] Timothy A. Davis,et al. Multiple-Rank Modifications of a Sparse Cholesky Factorization , 2000, SIAM J. Matrix Anal. Appl..
[200] Patrick M. Knupp,et al. Fundamentals of Grid Generation , 2020 .
[201] Franz Rendl,et al. QAPLIB – A Quadratic Assignment Problem Library , 1997, J. Glob. Optim..
[202] Anamitra R. Choudhury,et al. Multifrontal Factorization of Sparse SPD Matrices on GPUs , 2011, 2011 IEEE International Parallel & Distributed Processing Symposium.
[203] Henk A. van der Vorst,et al. Bi-CGSTAB: A Fast and Smoothly Converging Variant of Bi-CG for the Solution of Nonsymmetric Linear Systems , 1992, SIAM J. Sci. Comput..
[204] Gene Poole,et al. Accelerating the ANSYS Direct Sparse Solver with GPUs , 2011 .
[205] William Gropp,et al. Efficient Management of Parallelism in Object-Oriented Numerical Software Libraries , 1997, SciTools.
[206] Jitendra Malik,et al. Scale-Space and Edge Detection Using Anisotropic Diffusion , 1990, IEEE Trans. Pattern Anal. Mach. Intell..
[207] Timothy A. Davis,et al. Row Modifications of a Sparse Cholesky Factorization , 2005, SIAM J. Matrix Anal. Appl..
[208] Yanqing Chen,et al. Algorithm 8xx: CHOLMOD, supernodal sparse Cholesky factorization and update/downdate , 2006 .
[209] J. Grcar. How ordinary elimination became Gaussian elimination , 2009, 0907.2397.
[210] W. Arnoldi. The principle of minimized iterations in the solution of the matrix eigenvalue problem , 1951 .
[211] Sathish S. Vadhiyar,et al. An efficient MPI_allgather for grids , 2007, HPDC '07.
[212] Thomas Stützle,et al. Iterated local search for the quadratic assignment problem , 2006, Eur. J. Oper. Res..
[213] Rajesh Bordawekar,et al. Optimizing Sparse Matrix-Vector Multiplication on GPUs , 2009 .
[214] Louis A. Hageman,et al. Iterative Solution of Large Linear Systems. , 1971 .
[215] David S. Johnson,et al. Computers and Intractability: A Guide to the Theory of NP-Completeness , 1978 .