Chip‐level and multi‐node analysis of energy‐optimized lattice Boltzmann CFD simulations
[1] Robert Schöne,et al. Memory Performance at Reduced CPU Clock Speeds: An Analysis of Current x86_64 Processors, 2012, HotPower.
[2] Arndt Bode. Energy to Solution: A New Mission for Parallel Computing , 2013, Euro-Par.
[3] Leonid Oliker,et al. Magnetohydrodynamic Turbulence Simulations on the Earth Simulator Using the Lattice Boltzmann Method, 2005.
[4] Gerhard Wellein,et al. Benchmark Analysis and Application Results for Lattice Boltzmann Simulations on NEC SX Vector and Intel Nehalem Systems, 2009, Parallel Process. Lett.
[5] Gerhard Wellein,et al. Pushing the limits for medical image reconstruction on recent standard multicore processors , 2011, Int. J. High Perform. Comput. Appl..
[6] Richard W. Vuduc,et al. A Roofline Model of Energy , 2013, 2013 IEEE 27th International Symposium on Parallel and Distributed Processing.
[7] Gerhard Wellein,et al. Comparison of different propagation steps for lattice Boltzmann methods , 2011, Comput. Math. Appl..
[8] Cass T. Miller,et al. A high-performance lattice Boltzmann implementation to model flow in porous media, 2003.
[9] Gerhard Wellein,et al. Leveraging Shared Caches for Parallel Temporal Blocking of Stencil Codes on Multicore Processors and Clusters, 2010, Parallel Process. Lett.
[10] François Bertrand,et al. On improving the performance of large parallel lattice Boltzmann flow simulations in heterogeneous porous media, 2010.
[11] S. Roller,et al. A fully distributed CFD framework for massively parallel systems, 2012.
[12] Gerhard Wellein,et al. On the single processor performance of simple lattice Boltzmann kernels, 2006.
[13] Yale N. Patt,et al. Feedback-driven threading: power-efficient and high-performance execution of multi-threaded workloads on CMPs , 2008, ASPLOS.
[14] Massimo Bernaschi,et al. MUPHY: A parallel high performance MUlti PHYsics/Scale code , 2008, 2008 IEEE International Symposium on Parallel and Distributed Processing.
[15] Thomas Zeiser,et al. Performance evaluation of a parallel sparse lattice Boltzmann solver , 2008, J. Comput. Phys..
[16] Stephen W. Poole,et al. Towards efficient supercomputing: searching for the right efficiency metric , 2012, ICPE '12.
[17] D. d'Humières,et al. Two-relaxation-time Lattice Boltzmann scheme: About parametrization, velocity, pressure and mixed boundary conditions, 2008.
[18] Samuel Williams,et al. Roofline: an insightful visual performance model for multicore architectures , 2009, CACM.
[19] Ernst Rank,et al. Parallelization Strategies and Efficiency of CFD Computations in Complex Geometries Using Lattice Boltzmann Methods on High-Performance Computers, 2002.
[20] Intel Corporation. Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 3A: System Programming Guide, Part 1, 2006.
[21] Xiaoxian Zhang,et al. Domain-decomposition method for parallel lattice Boltzmann simulation of incompressible flow in porous media, 2005, Physical Review E: Statistical, Nonlinear, and Soft Matter Physics.
[22] Dong Li,et al. Strategies for Energy-Efficient Resource Management of Hybrid Programming Models , 2013, IEEE Transactions on Parallel and Distributed Systems.
[23] Peter Bailey,et al. Accelerating Lattice Boltzmann Fluid Flow Simulations Using Graphics Processors , 2009, 2009 International Conference on Parallel Processing.
[24] Massimo Bernaschi,et al. Multiscale Simulation of Cardiovascular flows on the IBM Bluegene/P: Full Heart-Circulation System at Red-Blood Cell Resolution , 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.
[25] Gerhard Wellein,et al. Exploring performance and power properties of modern multi‐core chips via simple machine models , 2012, Concurr. Comput. Pract. Exp..
[26] Tuomo Rossi,et al. Comparison of implementations of the lattice-Boltzmann method , 2008, Comput. Math. Appl..
[27] Efraim Rotem,et al. Power-Management Architecture of the Intel Microarchitecture Code-Named Sandy Bridge , 2012, IEEE Micro.
[28] Georg Hager,et al. Introducing a Performance Model for Bandwidth-Limited Loop Kernels , 2009, PPAM.
[29] Samuel Williams,et al. Optimization of a lattice Boltzmann computation on state-of-the-art multicore platforms , 2009, J. Parallel Distributed Comput..
[30] Constantine Bekas,et al. A new energy aware performance metric , 2010, Computer Science - Research and Development.