Oncilla: A GAS runtime for efficient resource allocation and data movement in accelerated clusters
Holger Fröning | Karsten Schwan | Sudhakar Yalamanchili | Jeffrey S. Young | Alex Merritt | Se Hoon Shon
[1] Bingsheng He, et al. Relational query coprocessing on graphics processors, 2009, TODS.
[2] Courtenay T. Vaughan, et al. Investigating the Impact of the Cielo Cray XE6 Architecture on Scientific Application Codes, 2011, 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and PhD Forum.
[3] Kunle Olukotun, et al. Accelerating CUDA graph algorithms at maximum warp, 2011, PPoPP '11.
[4] Michael Garland, et al. Designing a unified programming model for heterogeneous machines, 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.
[5] Dan Bonachea. GASNet Specification, v1.1, 2002.
[6] Wei Jiang, et al. Scheduling Concurrent Applications on a Cluster of CPU-GPU Nodes, 2012, 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012).
[7] Tong Liu, et al. The development of Mellanox/NVIDIA GPUDirect over InfiniBand—a new model for GPU to GPU communications, 2011, Computer Science - Research and Development.
[8] Dhabaleswar K. Panda, et al. High performance RDMA-based design of HDFS over InfiniBand, 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.
[9] Sudhakar Yalamanchili, et al. Relational algorithms for multi-bulk-synchronous processors, 2013, PPoPP '13.
[10] Massimo Bernaschi, et al. Breadth First Search on APEnet+, 2012, 2012 SC Companion: High Performance Computing, Networking Storage and Analysis.
[11] Massimo Bernaschi, et al. Benchmarking of communication techniques for GPUs, 2013, J. Parallel Distributed Comput.
[12] Tetsu Narumi, et al. DS-CUDA: A Middleware to Use Many GPUs in the Cloud Environment, 2012, 2012 SC Companion: High Performance Computing, Networking Storage and Analysis.
[13] Ulrich Brüning, et al. A Resource Optimized Remote-Memory-Access Architecture for Low-latency Communication, 2009, 2009 International Conference on Parallel Processing.
[14] Holger Fröning, et al. GGAS: Global GPU address spaces for efficient communication in heterogeneous clusters, 2013, 2013 IEEE International Conference on Cluster Computing (CLUSTER).
[15] Holger Fröning, et al. Efficient hardware support for the Partitioned Global Address Space, 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and PhD Forum (IPDPSW).
[16] Holger Fröning, et al. On Achieving High Message Rates, 2013, 2013 13th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing.
[17] Federico Silla, et al. Enabling CUDA acceleration within virtual machines using rCUDA, 2011, 2011 18th International Conference on High Performance Computing.
[18] Collin McCurdy, et al. The Scalable Heterogeneous Computing (SHOC) benchmark suite, 2010, GPGPU-3.
[19] Mikyung Kang, et al. Heterogeneous Cloud Computing, 2011, 2011 IEEE International Conference on Cluster Computing.
[20] Parag Agrawal, et al. The case for RAMCloud, 2011, Commun. ACM.
[21] Werner Vogels, et al. Eventually consistent, 2008, CACM.
[22] Holger Fröning, et al. MEMSCALE™: A Scalable Environment for Databases, 2011, 2011 IEEE International Conference on High Performance Computing and Communications.
[23] Andrew A. Chien, et al. A software architecture for global address space communication on clusters: put/get on fast messages, 1998, Proceedings. The Seventh International Symposium on High Performance Distributed Computing (Cat. No.98TB100244).
[24] Sudhakar Yalamanchili, et al. Kernel Weaver: Automatically Fusing Database Primitives for Efficient GPU Computation, 2012, 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture.
[25] Sudhakar Yalamanchili, et al. Optimizing Data Warehousing Applications for GPUs Using Kernel Fusion/Fission, 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum.
[26] Carlos Reaño, et al. CU2rCU: Towards the complete rCUDA remote GPU virtualization and sharing solution, 2012, 2012 19th International Conference on High Performance Computing.
[27] Vishakha Gupta, et al. Shadowfax: scaling in heterogeneous cluster systems via GPGPU assemblies, 2011, VTDC '11.
[28] Brian W. Barrett, et al. Introducing the Graph 500, 2010.