Convergence and Scalarization in Whole Function Vectorization
[1] Sam S. Stone, et al. MCUDA: An Efficient Implementation of CUDA Kernels on Multi-cores, 2011.
[2] Sudhakar Yalamanchili, et al. Ocelot: A dynamic optimization framework for bulk-synchronous applications in heterogeneous systems, 2010, 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT).
[3] Sebastian Hack, et al. Whole-function vectorization, 2011, International Symposium on Code Generation and Optimization (CGO 2011).
[4] Fernando Magno Quintão Pereira, et al. Divergence Analysis and Optimizations, 2011, 2011 International Conference on Parallel Architectures and Compilation Techniques.
[5] Mike Murphy, et al. Efficient compilation of fine-grained SPMD-threaded programs for multicore CPUs, 2010, CGO '10.
[6] Krste Asanovic, et al. Convergence and scalarization for data-parallel architectures, 2013, Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization (CGO).