Billion-particle SIMD-friendly two-point correlation on large-scale HPC cluster systems
John Shalf | Pradeep Dubey | Horst D. Simon | Jongsoo Park | Changkyu Kim | Jatin Chhugani | Hemant Shukla
[1] Robert H. Halstead, et al. Lazy task creation: a technique for increasing the granularity of parallel programs, 1990, LISP and Functional Programming.
[2] Robert J. Brunner, et al. Accelerating Cosmological Data Analysis with FPGAs, 2009, 2009 17th IEEE Symposium on Field Programmable Custom Computing Machines.
[3] A. Kashlinsky, et al. Large-scale structure in the Universe, 1991, Nature.
[4] David A. Bader, et al. Practical parallel algorithms for dynamic data redistribution, median finding, and selection, 1995, Proceedings of International Conference on Parallel Processing.
[5] Christopher J. Hughes, et al. Carbon: architectural support for fine-grained parallelism on chip multiprocessors, 2007, ISCA '07.
[6] J. Cordes. The Square Kilometer Array, 2006.
[7] John C. Hart, et al. Parallel SAH k-D tree construction, 2010, HPG '10.
[8] Kirk D. Borne, et al. Galaxy Evolution with LSST, 2010.
[9] Robert J. Brunner, et al. Accelerating cosmological data analysis with graphics processors, 2009, GPGPU-2.
[10] Pradeep Dubey, et al. Sort vs. Hash Revisited: Fast Join Implementation on Modern Multi-Core CPUs, 2009, Proc. VLDB Endow.
[11] Piet Hut, et al. A hierarchical O(N log N) force-calculation algorithm, 1986, Nature.
[12] Amar Phanishayee, et al. FAWN: a fast array of wimpy nodes, 2011, Commun. ACM.
[13] Edward T. Grochowski, et al. Larrabee: A many-Core x86 architecture for visual computing, 2008, 2008 IEEE Hot Chips 20 Symposium (HCS).
[14] Edward J. Wollack, et al. Seven-Year Wilkinson Microwave Anisotropy Probe (WMAP) Observations: Planets and Celestial Calibration Sources, 2010, 1001.4731.
[15] Tsuyoshi Hamada, et al. 190 TFlops Astrophysical N-body Simulation on a Cluster of GPUs, 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.
[16] Amar Phanishayee, et al. FAWN: a fast array of wimpy nodes, 2009, SOSP '09.
[17] Robert J. Brunner, et al. Implementation of the two-point angular correlation function on a high-performance reconfigurable computer, 2009, Sci. Program.
[18] L. Wasserman, et al. Fast Algorithms and Efficient Statistics: N-Point Correlation Functions, 2000, astro-ph/0012333.
[19] J. Koomey. Worldwide electricity used in data centers, 2008.
[20] Andrew W. Moore, et al. 'N-Body' Problems in Statistical Learning, 2000, NIPS.
[21] Christopher J. Hughes, et al. Atomic Vector Operations on Chip Multiprocessors, 2008, 2008 International Symposium on Computer Architecture.
[22] Christopher J. Hughes, et al. Computer Vision on Multi-Core Processors: Articulated Body Tracking, 2007, 2007 IEEE International Conference on Multimedia and Expo.
[23] Joseph Lazio. The Square Kilometer Array, 2008.
[24] Ray P. Norris. Data Challenges for Next-generation Radio Telescopes, 2010, 2010 Sixth IEEE International Conference on e-Science Workshops.
[25] Jon Louis Bentley, et al. Multidimensional binary search trees used for associative searching, 1975, CACM.
[26] Pradeep Dubey, et al. Designing and dynamically load balancing hybrid LU for multi/many-core, 2011, Computer Science - Research and Development.
[27] A. Szalay, et al. Bias and variance of angular correlation functions, 1993.
[28] Eftychios Sifakis, et al. Physical simulation for animation and visual effects: parallelization and characterization for chip multiprocessors, 2007, ISCA '07.
[29] Robert J. Brunner, et al. Fast Two-Point Correlations of Extremely Large Data Sets, 2008.
[30] Pradeep Dubey, et al. Fast sort on CPUs and GPUs: a case for bandwidth oblivious SIMD sort, 2010, SIGMOD Conference.
[31] Sriram Krishnamoorthy, et al. Scalable work stealing, 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.
[32] Laxmikant V. Kalé, et al. Enabling and scaling biomolecular simulations of 100 million atoms on petascale machines with a multicore-optimized message-driven runtime, 2011, 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[33] P. Peebles, et al. The Cosmological Constant and Dark Energy, 2002, astro-ph/0207347.
[34] Albert-Jan Boonstra, et al. DOME: towards the ASTRON & IBM center for exascale technology, 2012, Astro-HPC '12.