Auto-Tuning Complex Array Layouts for GPUs
[1] Anjul Patney, et al. Real-time Reyes-style adaptive surface subdivision, 2008, SIGGRAPH Asia '08.
[2] Richard W. Vuduc, et al. Model-driven autotuning of sparse matrix-vector multiply on GPUs, 2010, PPoPP '10.
[3] P. Hanrahan, et al. Sequoia: Programming the Memory Hierarchy, 2006, ACM/IEEE SC 2006 Conference (SC'06).
[4] S. Popov, et al. Experiences with Streaming Construction of SAH KD-Trees, 2006, 2006 IEEE Symposium on Interactive Ray Tracing.
[5] Ingo Wald, et al. Fast Construction of SAH BVHs on the Intel Many Integrated Core (MIC) Architecture, 2012, IEEE Transactions on Visualization and Computer Graphics.
[6] Michael C. Doggett, et al. Auto-tuning interactive ray tracing using an analytical GPU architecture model, 2012, GPGPU-5.
[7] Kun Zhou, et al. RenderAnts: interactive Reyes rendering on GPUs, 2009, SIGGRAPH 2009.
[8] Liqiang Wang, et al. Auto-Tuning CUDA Parameters for Sparse Matrix-Vector Multiplication on GPUs, 2010, 2010 International Conference on Computational and Information Sciences.
[9] He Huang, et al. A model-driven partitioning and auto-tuning integrated framework for sparse matrix-vector multiplication on GPUs, 2011.
[10] Kunle Olukotun, et al. Green-Marl: a DSL for easy and efficient graph analysis, 2012, ASPLOS XVII.
[11] David D. Cox, et al. Machine learning for predictive auto-tuning with boosted regression trees, 2012, 2012 Innovative Parallel Computing (InPar).
[12] William Gropp, et al. An adaptive performance modeling tool for GPU architectures, 2010, PPoPP '10.
[13] Frédo Durand, et al. Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines, 2013, PLDI 2013.
[14] Kenneth E. Batcher, et al. Sorting networks and their applications, 1968, AFIPS Spring Joint Computing Conference.
[15] Mary Hall, et al. Autotuning, code generation and optimizing compiler technology for GPUs, 2012.
[16] Hans Henrik Brandenborg Sørensen, et al. Auto-tuning Dense Vector and Matrix-Vector Operations for Fermi GPUs, 2011, PPAM.
[17] Kun Zhou, et al. RenderAnts: interactive Reyes rendering on GPUs, 2009, SIGGRAPH 2009.
[18] Chun Chen, et al. A Programming Language Interface to Describe Transformations and Code Generation, 2010, LCPC.
[19] Kurt Keutzer, et al. Copperhead: compiling an embedded data parallel language, 2011, PPoPP '11.
[20] Robert L. Cook, et al. The Reyes image rendering architecture, 1987, SIGGRAPH.
[21] Dana Schaa, et al. Modeling execution and predicting performance in multi-GPU environments, 2009.
[22] Tarek S. Abdelrahman, et al. hiCUDA: a high-level directive-based language for GPU programming, 2009, GPGPU-2.
[23] Anjul Patney, et al. Real-time Reyes-style adaptive surface subdivision, 2008, SIGGRAPH 2008.
[24] Frank Mueller, et al. Autogeneration and Autotuning of 3D Stencil Codes on Homogeneous and Heterogeneous GPU Clusters, 2013, IEEE Transactions on Parallel and Distributed Systems.
[25] Jan Vitek, et al. Terra: a multi-stage language for high-performance computing, 2013, PLDI.