Algorithmic and Software System Support to Accelerate Data Processing in CPU-GPU Hybrid Computing Environments
[1] Jie Cheng,et al. Programming Massively Parallel Processors. A Hands-on Approach, 2010, Scalable Comput. Pract. Exp..
[2] Xiaoning Ding,et al. BWS: balanced work stealing for time-sharing multicores , 2012, EuroSys '12.
[3] Jun Kong,et al. A data model and database for high-resolution pathology analytical image informatics , 2011, Journal of pathology informatics.
[4] Vanish Talwar,et al. Pegasus: Coordinated Scheduling for Virtualized Accelerator-based Systems , 2011, USENIX Annual Technical Conference.
[5] Pradeep Dubey,et al. Debunking the 100X GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU , 2010, ISCA.
[6] Anastasia Ailamaki,et al. QPipe: a simultaneously pipelined relational query engine , 2005, SIGMOD '05.
[7] Juan Pineda,et al. A parallel algorithm for polygon rasterization , 1988, SIGGRAPH.
[8] Jun Kong,et al. Integrated morphologic analysis for the identification and characterization of disease subtypes , 2012, J. Am. Medical Informatics Assoc..
[9] Bingsheng He,et al. High-Throughput Transaction Executions on Graphics Processors , 2011, Proc. VLDB Endow..
[10] Daniel T. Larose,et al. Discovering Knowledge in Data: An Introduction to Data Mining, 2005.
[11] Dawson R. Engler,et al. Exterminate all operating system abstractions , 1995, Proceedings 5th Workshop on Hot Topics in Operating Systems (HotOS-V).
[12] Sudhakar Yalamanchili,et al. Kernel Weaver: Automatically Fusing Database Primitives for Efficient GPU Computation , 2012, 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture.
[13] Aaftab Munshi,et al. The OpenCL specification , 2009, 2009 IEEE Hot Chips 21 Symposium (HCS).
[14] Kevin Skadron,et al. Rodinia: A benchmark suite for heterogeneous computing , 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).
[15] Seungyeop Han,et al. SSLShader: Cheap SSL Acceleration with Commodity Processors , 2011, NSDI.
[16] Peter Benjamin Volk,et al. GPU join processing revisited , 2012, DaMoN '12.
[17] James Demmel,et al. the Parallel Computing Landscape, 2022.
[18] Yuan Yuan,et al. The Yin and Yang of Processing Data Warehousing Queries on GPU Devices , 2013, Proc. VLDB Endow..
[19] Hyesoon Kim. Supporting virtual memory in GPGPU without supporting precise exceptions , 2012, MSPC '12.
[20] Bingsheng He,et al. Relational joins on graphics processors , 2008, SIGMOD Conference.
[21] Peter J. Denning,et al. Third Generation Computer Systems , 1971, CSUR.
[22] John E. Stone,et al. An asymmetric distributed shared memory model for heterogeneous parallel systems , 2010, ASPLOS XV.
[23] Feng Ji,et al. RSVM: A Region-based Software Virtual Memory for GPU , 2013, Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques.
[24] Divyakant Agrawal,et al. Hardware Acceleration in Commercial Databases: A Case Study of Spatial Operations , 2004, VLDB.
[25] Christos Faloutsos,et al. Hilbert R-tree: An Improved R-tree using Fractals , 1994, VLDB.
[26] Mark de Berg,et al. Computational geometry: algorithms and applications, 1997.
[27] Peter J. Denning,et al. Virtual memory , 1970, CSUR.
[28] Dinesh Manocha,et al. Fast computation of database operations using graphics processors , 2005, SIGGRAPH Courses.
[29] Sebastian Breß,et al. Why it is time for a HyPE: A Hybrid Query Processing Engine for Efficient GPU Coprocessing in DBMS , 2013, Proc. VLDB Endow..
[30] Mohan S. Kankanhalli,et al. Calculating the Area of Overlaid Polygons Without Constructing the Overlay, 1994.
[31] Ronald L. Wasserstein,et al. Monte Carlo: Concepts, Algorithms, and Applications, 1997.
[32] John Poulton. An embedded DRAM for CMOS ASICs , 1997, Proceedings Seventeenth Conference on Advanced Research in VLSI.
[33] Joseph O'Rourke,et al. Computational Geometry in C, 1995.
[34] Idit Keidar,et al. GPUfs: Integrating a file system with GPUs , 2013, TOCS.
[35] Shinpei Kato,et al. TimeGraph: GPU Scheduling for Real-Time Multi-Tasking Environments , 2011, USENIX Annual Technical Conference.
[36] Martin L. Kersten,et al. The researcher's guide to the data deluge , 2011, Proc. VLDB Endow..
[37] David J. DeWitt,et al. Building a scaleable geo-spatial DBMS: technology, implementation, and evaluation , 1997, SIGMOD '97.
[38] Gang Wang,et al. Efficient Parallel Lists Intersection and Index Compression Algorithms using Graphics Processing Units , 2011, Proc. VLDB Endow..
[39] Lei Jiang,et al. Die Stacking (3D) Microarchitecture , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[40] Michael R. Macedonia,et al. The GPU Enters Computing's Mainstream , 2003, Computer.
[41] Shinpei Kato,et al. GDM: device memory management for gpgpu computing , 2014, SIGMETRICS '14.
[42] Fusheng Wang,et al. YSmart: Yet Another SQL-to-MapReduce Translator , 2011, 2011 31st International Conference on Distributed Computing Systems.
[43] Wen-mei W. Hwu,et al. Optimization principles and application performance evaluation of a multithreaded GPU using CUDA , 2008, PPoPP.
[44] Karthikeyan Sankaralingam,et al. iGPU: Exception support and speculative execution on GPUs , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).
[45] Pradeep Dubey,et al. Fast sort on CPUs and GPUs: a case for bandwidth oblivious SIMD sort , 2010, SIGMOD Conference.
[46] Alexey Kukanov,et al. The Foundations for Scalable Multicore Software in Intel Threading Building Blocks, 2007.
[47] Pradeep Dubey,et al. FAST: fast architecture sensitive tree search on modern CPUs and GPUs , 2010, SIGMOD Conference.
[48] M. Berger,et al. Adaptive mesh refinement for hyperbolic partial differential equations, 1982.
[49] Onur Mutlu,et al. Parallelism-Aware Batch Scheduling: Enhancing both Performance and Fairness of Shared DRAM Systems , 2008, 2008 International Symposium on Computer Architecture.
[50] Magdalena Balazinska,et al. Analyzing massive astrophysical datasets: Can Pig/Hadoop or a relational DBMS help? , 2009, 2009 IEEE International Conference on Cluster Computing and Workshops.
[51] Jingren Zhou,et al. SCOPE: easy and efficient parallel processing of massive data sets , 2008, Proc. VLDB Endow..
[52] Dinesh Manocha,et al. GPUTeraSort: high performance graphics co-processor sorting for large database management , 2006, SIGMOD Conference.
[53] Volker Markl,et al. A First Step Towards GPU-assisted Query Optimization , 2012, ADMS@VLDB.
[54] Volker Markl,et al. Hardware-Oblivious Parallelism for In-Memory Column-Stores , 2013, Proc. VLDB Endow..
[55] Martin L. Kersten,et al. Waste not… Efficient co-processing of relational data , 2014, 2014 IEEE 30th International Conference on Data Engineering.
[56] Shinpei Kato,et al. Gdev: First-Class GPU Resource Management in the Operating System , 2012, USENIX Annual Technical Conference.
[57] Erik Lindholm,et al. NVIDIA Tesla: A Unified Graphics and Computing Architecture , 2008, IEEE Micro.
[58] Kenneth A. Ross,et al. Ameliorating memory contention of OLAP operators on GPU processors , 2012, DaMoN '12.
[59] A. Snavely,et al. Symbiotic jobscheduling for a simultaneous multithreading processor , 2000, SIGP.
[60] Divyakant Agrawal,et al. Hardware acceleration for spatial selections and joins , 2003, SIGMOD '03.
[61] Mark Silberstein,et al. PTask: operating system abstractions to manage GPUs as compute devices , 2011, SOSP.
[62] Martin L. Kersten,et al. Accelerating Foreign-Key Joins using Asymmetric Memory Channels , 2011, ADMS@VLDB.
[63] David I. August,et al. Automatic CPU-GPU communication management and optimization , 2011, PLDI '11.
[64] Joel H. Saltz,et al. Accelerating Pathology Image Data Cross-Comparison on CPU-GPU Hybrid Systems , 2012, Proc. VLDB Endow..
[65] Yao Zhang,et al. A quantitative performance analysis model for GPU architectures , 2011, 2011 IEEE 17th International Symposium on High Performance Computer Architecture.