[1] Maged M. Michael. Scalable lock-free dynamic memory allocation, 2004, PLDI '04.
[2] Ulf Assarsson, et al. Efficient stream compaction on wide SIMD many-core architectures, 2009, High Performance Graphics.
[3] Michael Goesele, et al. Fast dynamic memory allocator for massively parallel architectures, 2013, GPGPU@ASPLOS.
[4] Yannis Manolopoulos, et al. Hierarchical Bitmap Index: An Efficient and Scalable Indexing Technique for Set-Valued Attributes, 2003, ADBIS.
[5] Carlchristian Eckert, et al. Enhancements of the massively parallel memory allocator ScatterAlloc and its adaption to the general interface mallocMC, 2014.
[6] John D. Owens, et al. A Dynamic Hash Table for the GPU, 2017, 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS).
[7] Rj Allan, et al. Survey of Agent Based Modelling and Simulation Tools, 2009.
[8] Kathryn S. McKinley, et al. Hoard: a scalable memory allocator for multithreaded applications, 2000, ASPLOS.
[9] Stephen Jones, et al. XMalloc: A Scalable Lock-free Dynamic Memory Allocator for Many-core Machines, 2010, 2010 10th IEEE International Conference on Computer and Information Technology.
[10] Holger Homann, et al. SoAx: A generic C++ Structure of Arrays for handling particles in HPC codes, 2017, Comput. Phys. Commun.
[11] Michael Goesele, et al. MATOG: Array Layout Auto-Tuning for CUDA, 2017, TACO.
[12] Andreas Polze, et al. A Performance Evaluation of Dynamic Parallelism for Fine-Grained, Irregular Workloads, 2016, Int. J. Netw. Comput.
[13] Stefania Bandini, et al. Agent Based Modeling and Simulation: An Informatics Perspective, 2009, J. Artif. Soc. Soc. Simul.
[14] Xiaoming Li, et al. CUDA Memory Optimizations for Large Data-Structures in the Gravit Simulator, 2009, 2009 International Conference on Parallel Processing Workshops.
[15] M. Steinberger, et al. ScatterAlloc: Massively parallel dynamic memory allocation for the GPU, 2012, 2012 Innovative Parallel Computing (InPar).
[16] John D. Owens, et al. A Work-Efficient Step-Efficient Prefix Sum Algorithm, 2006.
[17] Vasily Volkov, et al. Understanding Latency Hiding on GPUs, 2016.
[18] Kenli Li, et al. Parallel Implementation of MAFFT on CUDA-Enabled Graphics Hardware, 2015, IEEE/ACM Transactions on Computational Biology and Bioinformatics.
[19] Marina Papatriantafilou, et al. Lock-free Concurrent Data Structures, 2013, ArXiv.
[20] M. Pharr, et al. ispc: A SPMD compiler for high-performance CPU programming, 2012, 2012 Innovative Parallel Computing (InPar).
[21] Stephen John Turner, et al. Supporting efficient execution of continuous space agent‐based simulation on GPU, 2016, Concurr. Comput. Pract. Exp.
[22] Laxmikant V. Kalé, et al. CHARM++: a portable concurrent object oriented system based on C++, 1993, OOPSLA '93.
[23] Nathan Bell, et al. Thrust: A Productivity-Oriented Library for CUDA, 2012.
[24] Chuck Lever, et al. Malloc() Performance in a Multithreaded Linux Environment, 2000, USENIX Annual Technical Conference, FREENIX Track.
[25] Efficient Neighbor Searching for Agent-Based Simulation on GPU, 2014, 2014 IEEE/ACM 18th International Symposium on Distributed Simulation and Real Time Applications.
[26] Sophia Drossopoulou, et al. You can have it all: abstraction and good cache performance, 2017, Onward!.
[27] Ganesh Gopalakrishnan, et al. GPU Concurrency: Weak Behaviours and Programming Assumptions, 2015, ASPLOS.
[28] Kei Davis, et al. Parallel Object-Oriented Scientific Computing Today, 2003, ECOOP Workshops.
[29] Thomas Fahringer, et al. Automatic Data Layout Optimizations for GPUs, 2015, Euro-Par.
[30] Emery D. Berger, et al. A locality-improving dynamic memory allocator, 2005, MSP '05.
[31] Duane Merrill, et al. Single-pass Parallel Prefix Scan with Decoupled Lookback, 2016.
[32] Ana Lucia Varbanescu, et al. KMA: A Dynamic Memory Manager for OpenCL, 2014, GPGPU@ASPLOS.
[33] Stephen John Turner, et al. Cloning Agent-based Simulation on GPU, 2015, SIGSIM-PADS.
[34] Robert Strzodka, et al. Abstraction for AoS and SoA layout in C++, 2011.
[35] Radek Stibora. Building of SBVH on Graphical Hardware, 2016.
[36] Vernon Rego, et al. Efficient Algorithms for Stream Compaction on GPUs, 2017, Int. J. Netw. Comput.
[37] Vlastimil Havran, et al. Register Efficient Dynamic Memory Allocator for GPUs, 2015, Comput. Graph. Forum.
[38] Hidehiko Masuhara, et al. Ikra-Cpp: A C++/CUDA DSL for Object-Oriented Programming with Structure-of-Arrays Layout, 2018, WPMVP@PPoPP.
[39] Atsushi Ohori, et al. An efficient non-moving garbage collector for functional languages, 2011, ICFP.
[40] Julian Cummings, et al. Comparison of C++ and Fortran 90 for object-oriented scientific programming, 1997.