Power/Performance Trade-Offs of Small Batched LU Based Solvers on GPUs
Massimiliano Fatica | Antonino Tumeo | Oreste Villa | Nitin Gawande
[1] Karsten Pruess, et al. User's Guide for TOUGH2-MP - A Massively Parallel Version of the TOUGH2 Code, 2008.
[2] Martinus Oostrom, et al. STOMP Subsurface Transport Over Multiple Phases, Version 4.0, User’s Guide, 2006.
[3] Rubén Cañedo Andalia, et al. Bentham Science Publishers, 2008.
[4] Nicholas J. Higham, et al. Gaussian elimination, 2011, Introduction to Finite Elements in Engineering.
[5] Emmanuel Agullo, et al. LU factorization for accelerator-based systems, 2011, 2011 9th IEEE/ACS International Conference on Computer Systems and Applications (AICCSA).
[6] Chuan Lu, et al. PFLOTRAN: Reactive Flow & Transport Code for Use on Laptops to Leadership-Class Supercomputers, 2012.
[7] Jack J. Dongarra, et al. Dense linear algebra solvers for multicore with GPU accelerators, 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW).
[8] Jack J. Dongarra, et al. Enabling and scaling matrix computations on heterogeneous multi-core and multi-GPU systems, 2012, ICS '12.
[9] Fan Zhang, et al. Application of a hybrid MPI/OpenMP approach for parallel groundwater model calibration using multi-core computers, 2010, Comput. Geosci.