Performance evaluation and optimization of random memory access on multicores with high productivity
[1] Jack Dongarra, et al. Introduction to the HPCChallenge Benchmark Suite, 2004.
[2] Philip Heidelberger, et al. HPCC RandomAccess benchmark for next generation supercomputers, 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.
[3] Yogish Sabharwal, et al. Software Routing and Aggregation of Messages to Optimize the Performance of HPCC Randomaccess Benchmark, 2006, ACM/IEEE SC 2006 Conference (SC'06).
[4] Tao Zhang, et al. Prefetching irregular references for software cache on cell, 2008, CGO '08.
[5] Rodney A. Kennedy, et al. Efficient Histogram Algorithms for NVIDIA CUDA Compatible Devices, 2007.
[6] Courtenay T. Vaughan, et al. A Simple Synchronous Distributed-Memory Algorithm for the HPCC RandomAccess Benchmark, 2006, 2006 IEEE International Conference on Cluster Computing.
[7] I. Wald, et al. Ray Tracing on the Cell Processor, 2006, 2006 IEEE Symposium on Interactive Ray Tracing.
[8] J. Hornegger, et al. Fast GPU-Based CT Reconstruction using the Common Unified Device Architecture (CUDA), 2007, 2007 IEEE Nuclear Science Symposium Conference Record.
[9] Jason N. Dale, et al. Cell Broadband Engine Architecture and its first implementation - A performance view, 2007, IBM J. Res. Dev.
[10] Eduard Ayguadé, et al. A Novel Asynchronous Software Cache Implementation for the Cell-BE Processor, 2007, LCPC.