Unstructured computational aerodynamics on many integrated core architecture
[1] David E. Keyes, et al. Pseudotransient Continuation and Differential-Algebraic Equations, 2003, SIAM J. Sci. Comput.
[2] William Gropp, et al. Globalized Newton-Krylov-Schwarz Algorithms and Software for Parallel Implicit CFD, 2000, Int. J. High Perform. Comput. Appl.
[3] Alexander Heinecke, et al. Towards High-Performance Optimizations of the Unstructured Open-Source SU2 Suite, 2015.
[4] Fan Ye, et al. The Exploration of Pervasive and Fine-Grained Parallel Model Applied on Intel Xeon Phi Coprocessor, 2013, 2013 Eighth International Conference on P2P, Parallel, Grid, Cloud and Internet Computing.
[5] David E. Keyes, et al. Hybrid Programming Model for Implicit PDE Simulations on Multicore Architectures, 2011, IWOMP.
[6] Paul H. J. Kelly, et al. Performance analysis of the OP2 framework on many-core architectures, 2011, PERV.
[7] W. K. Anderson, et al. Implicit/Multigrid Algorithms for Incompressible Turbulent Flows on Unstructured Grids, 1995.
[8] Vipin Kumar, et al. A Fast and High Quality Multilevel Scheme for Partitioning Irregular Graphs, 1998, SIAM J. Sci. Comput.
[9] Pradeep Dubey, et al. Exploring Shared-Memory Optimizations for an Unstructured Mesh CFD Application on Modern Parallel Systems, 2015, 2015 IEEE International Parallel and Distributed Processing Symposium.
[10] David E. Keyes, et al. Prospects for CFD on Petaflops Systems, 1997.
[11] W. K. Anderson, et al. An implicit upwind algorithm for computing turbulent flows on unstructured grids, 1994.
[12] Rezaur Rahman, et al. Intel Xeon Phi Coprocessor Architecture and Tools: The Guide for Application Developers, 2013.
[13] Stephen A. Jarvis, et al. Exploring SIMD for Molecular Dynamics, 2013.
[14] Gihan R. Mudalige, et al. Vectorizing unstructured mesh computations for many‐core architectures, 2016, Concurr. Comput. Pract. Exp.
[15] Xinmin Tian, et al. Practical SIMD Vectorization Techniques for Intel® Xeon Phi Coprocessors, 2013, 2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum.
[16] Jianbin Fang, et al. An Empirical Study of Intel Xeon Phi, 2013, ArXiv.
[17] Stephen A. Jarvis, et al. Exploring SIMD for Molecular Dynamics, Using Intel® Xeon® Processors and Intel® Xeon Phi Coprocessors, 2013, 2013 IEEE 27th International Symposium on Parallel and Distributed Processing.
[18] D. Keyes, et al. Jacobian-free Newton-Krylov methods: a survey of approaches and applications, 2004.
[19] Ewing L. Lusk, et al. Early Experiments with the OpenMP/MPI Hybrid Programming Model, 2008, IWOMP.
[20] Ravi Narayanaswamy, et al. Offload Compiler Runtime for the Intel® Xeon Phi Coprocessor, 2013, 2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum.
[21] Geoffrey C. Fox, et al. Fortran 90D/HPF compiler for distributed memory MIMD computers: design, implementation, and performance results, 1993, Supercomputing '93.
[22] Rajeev Thakur, et al. MPI at Exascale, 2010.
[23] C. Kelley, et al. Convergence Analysis of Pseudo-Transient Continuation, 1998.
[24] Jesper Larsson Träff, et al. MPI on a Million Processors, 2009, PVM/MPI.
[25] Georg Hager, et al. Hybrid MPI/OpenMP Parallel Programming on Clusters of Multi-Core SMP Nodes, 2009, 2009 17th Euromicro International Conference on Parallel, Distributed and Network-based Processing.
[26] E. Cuthill, et al. Reducing the bandwidth of sparse symmetric matrices, 1969, ACM '69.
[27] Emre Kultursay, et al. Compiler-Based Data Prefetching and Streaming Non-temporal Store Generation for the Intel® Xeon Phi™ Coprocessor, 2013, 2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum.
[28] Y. Saad, et al. GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems, 1986.
[29] Dean M. Tullsen, et al. Simultaneous multithreading: Maximizing on-chip parallelism, 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[30] Eric J. Nielsen, et al. Production Level CFD Code Acceleration for Hybrid Many-Core Architectures, 2012.
[32] Yuzhong Shen, et al. Energy Evaluation for Applications with Different Thread Affinities on the Intel Xeon Phi, 2014, 2014 International Symposium on Computer Architecture and High Performance Computing Workshop.
[33] Sanjukta Bhowmick, et al. Parallel adaptive solvers in compressible petsc-fun3d simulations, 2006.
[34] Qing Zhang, et al. High-Performance Computing on the Intel® Xeon Phi™, 2014, Springer International Publishing.
[35] Luca Faust, et al. Modern Operating Systems, 2016.
[36] W. K. Anderson, et al. Achieving High Sustained Performance in an Unstructured Mesh CFD Application, 1999, ACM/IEEE SC 1999 Conference (SC'99).
[37] Michael Klemm, et al. OpenMP Programming on Intel Xeon Phi Coprocessors: An Early Performance Comparison, 2012, MARC@RWTH.
[38] David A. Patterson, et al. Computer Architecture, Fifth Edition: A Quantitative Approach, 2011.
[39] C. Kelley, et al. Pseudo-transient continuation and differential-algebraic equations, 2002.
[40] Nan Wu, et al. Utilizing Multiple Xeon Phi Coprocessors on One Compute Node, 2014, ICA3PP.
[41] D. Birchall, et al. Computational Fluid Dynamics, 2020, Radial Flow Turbocompressors.
[42] James Reinders, et al. Intel Xeon Phi Coprocessor High Performance Programming, 2013.
[43] Ümit V. Çatalyürek, et al. Performance Evaluation of Sparse Matrix Multiplication Kernels on Intel Xeon Phi, 2013, PPAM.
[44] William Gropp, et al. Efficient Management of Parallelism in Object-Oriented Numerical Software Libraries, 1997, SciTools.
[45] Guillaume Houzeaux, et al. Some useful strategies for unstructured edge‐based solvers on shared memory machines, 2011.
[46] Kai Li, et al. Full correlation matrix analysis of fMRI data on Intel® Xeon Phi™ coprocessors, 2015, SC15: International Conference for High Performance Computing, Networking, Storage and Analysis.
[47] Sabela Ramos, et al. Modeling communication in cache-coherent SMP systems: a case-study with Xeon Phi, 2013, HPDC.
[48] William Gropp, et al. High-performance parallel implicit CFD, 2001, Parallel Comput.