42 TFlops hierarchical N-body simulations on GPUs with applications in both astrophysics and turbulence
Makoto Taiji | Rio Yokota | Tsuyoshi Hamada | Tetsu Narumi | Kenji Yasuoka | Keigo Nitadori
[1] Toshiyuki Fukushige,et al. A 29.5 Tflops Simulation of Planetesimals in Uranus-Neptune Region on GRAPE-6 , 2002, ACM/IEEE SC 2002 Conference (SC'02).
[2] Piet Hut,et al. A hierarchical O(N log N) force-calculation algorithm , 1986, Nature.
[3] Thomas Sterling,et al. Pentium Pro Inside: I. A Treecode at 430 Gigaflops on ASCI Red , 1997 .
[4] Toshiyuki Fukushige,et al. Performance evaluation and tuning of GRAPE-6 - towards 40 "real" Tflops , 2003, SC.
[5] Atsushi Kawai,et al. $7.0/Mflops Astrophysical N-Body Simulation with Treecode on GRAPE-5 , 1999, SC.
[6] Shinnosuke Obi,et al. Calculation of isotropic turbulence using a pure Lagrangian vortex method , 2007, J. Comput. Phys.
[7] J. Monaghan,et al. Smoothed particle hydrodynamics: Theory and application to non-spherical stars , 1977 .
[8] Masaki Koga,et al. A 1.349 Tflops simulation of black holes in a galactic center on GRAPE-6 , 2000, ACM/IEEE SC 2000 Conference (SC'00).
[9] Alan H. Karp. Speeding up N-body Calculations on Machines without Hardware Square Root , 1992, Sci. Program..
[10] Leslie Greengard,et al. A fast algorithm for particle simulations , 1987 .
[11] Mark J. Stock,et al. Toward efficient GPU-accelerated N-body simulations , 2008 .
[12] Makoto Taiji,et al. Astrophysical N-body simulations on the GRAPE-4 Special-Purpose Computer , 1995, SC.
[13] Joshua E. Barnes,et al. A modified tree code: don't laugh; it runs , 1990 .
[14] Jun Makino,et al. Performance and accuracy of a GRAPE‐3 system for collisionless N‐body simulations , 1998 .
[15] Tsuyoshi Hamada,et al. The Chamomile Scheme: An Optimized Algorithm for N-body simulations on Programmable Graphics Processing Units , 2007 .
[16] Thomas L. Sterling,et al. Pentium Pro Inside: I. A Treecode at 430 Gigaflops on ASCI Red, II. Price/Performance of $50/Mflop on Loki and Hyglac , 1997, ACM/IEEE SC 1997 Conference (SC'97).
[17] Junichiro Makino,et al. A Fast Parallel Treecode with GRAPE , 2004 .
[18] Tomonari Masada,et al. A novel multiple-walk parallel algorithm for the Barnes–Hut treecode on GPUs – towards cost effective, high performance N-body simulation , 2009, Computer Science - Research and Development.
[19] Yao Zhang,et al. Scan primitives for GPU computing , 2007, GH '07.
[20] Junichiro Makino,et al. Performance Tuning of N-Body Codes on Modern Microprocessors: I. Direct Integration with a Hermite Scheme on x86_64 Architecture , 2006 .
[21] R. Rogallo. Numerical experiments in homogeneous turbulence , 1981 .
[22] Ramani Duraiswami,et al. Fast multipole methods on graphics processors , 2008, J. Comput. Phys.
[23] Petros Koumoutsakos,et al. Vortex Methods: Theory and Practice , 2000 .
[24] Simon Portegies Zwart,et al. SAPPORO: A way to turn your graphics cards into a GRAPE-6 , 2009, ArXiv.
[25] Toshiyuki Fukushige,et al. N-Body Simulation of Galaxy Formation on GRAPE-4 Special-Purpose Computer , 1996, Proceedings of the 1996 ACM/IEEE Conference on Supercomputing.
[26] Atsushi Kawai,et al. $158/GFLOPS astrophysical N-body simulation with reconfigurable add-in card and hierarchical tree algorithm , 2006, SC.
[27] Michael S. Warren,et al. Astrophysical N-body simulations using hierarchical tree data structures , 1992, Proceedings Supercomputing '92.
[28] David M. Beazley,et al. Avalon: an Alpha/Linux cluster achieves 10 Gflops for $15k , 1998, SC '98.
[29] Ryutaro Himeno,et al. A 55 TFLOPS simulation of amyloid-forming peptides from yeast prion Sup35 with the special-purpose computer system MDGRAPE-3 , 2006, SC.
[30] Robert G. Belleman,et al. High Performance Direct Gravitational N-body Simulations on Graphics Processing Units , 2007, ArXiv.
[31] Junichiro Makino,et al. Treecode with a Special-Purpose Processor , 1991 .