Concurrent Data Structures with Near-Data-Processing: an Architecture-Aware Implementation
Maurice Herlihy | R. Iris Bahar | Jiwon Choe | Tali Moreshet | Amy Huang
[1] Jung Ho Ahn, et al. McPAT: An integrated power, area, and timing modeling framework for multicore and manycore architectures, 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[2] Onur Mutlu, et al. GRIM-Filter: Fast seed location filtering in DNA read mapping using processing-in-memory technologies, 2017, BMC Genomics.
[3] Kees G. W. Goossens, et al. Improved Power Modeling of DDR SDRAMs, 2011, 2011 14th Euromicro Conference on Digital System Design.
[4] Sally A. McKee, et al. Hitting the memory wall: implications of the obvious, 1995, CARN.
[5] Scott A. Mahlke, et al. In-Memory Data Parallel Processor, 2018, ASPLOS.
[6] Christoforos E. Kozyrakis, et al. GraphP: Reducing Communication for PIM-Based Graph Processing with Efficient Data Partition, 2018, 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[7] Song Jiang, et al. Wormhole: A Fast Ordered Index for In-memory Data Management, 2018.
[8] Henri-Pierre Charles, et al. Micro-architectural simulation of embedded core heterogeneity with gem5 and McPAT, 2015, RAPIDO '15.
[9] Somayeh Sardashti, et al. The gem5 simulator, 2011, CARN.
[10] Maurice Herlihy, et al. Concurrent Data Structures for Near-Memory Computing, 2017, SPAA.
[11] Ki-Seok Chung, et al. HMC-MAC: Processing-in Memory Architecture for Multiply-Accumulate Operations with Hybrid Memory Cube, 2018, IEEE Computer Architecture Letters.
[12] Luigi Carro, et al. Processing in 3D memories to speed up operations on complex data structures, 2018, 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[13] Feifei Li, et al. NDC: Analyzing the impact of 3D-stacked memory+logic devices on MapReduce workloads, 2014, 2014 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[14] Rachata Ausavarungnirun, et al. Google Workloads for Consumer Devices: Mitigating Data Movement Bottlenecks, 2018, ASPLOS.
[15] Kiyoung Choi, et al. PIM-enabled instructions: A low-overhead, locality-aware processing-in-memory architecture, 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[16] William Pugh, et al. Skip Lists: A Probabilistic Alternative to Balanced Trees, 1989, WADS.
[17] Onur Mutlu, et al. Accelerating pointer chasing in 3D-stacked memory: Challenges, mechanisms, evaluation, 2016, 2016 IEEE 34th International Conference on Computer Design (ICCD).
[18] Luca Benini, et al. Design and Evaluation of a Processing-in-Memory Architecture for the Smart Memory Cube, 2016, ARCS.
[19] Bruce Jacob, et al. Memory Systems: Cache, DRAM, Disk, 2007.
[20] Hyesoon Kim, et al. Instruction Offloading with HMC 2.0 Standard: A Case Study for Graph Traversals, 2015, MEMSYS.
[21] Eric Ruppert, et al. Lock-free linked lists and skip lists, 2004, PODC '04.
[22] Ramyad Hadidi, et al. GraphPIM: Enabling Instruction-Level PIM Offloading in Graph Computing Frameworks, 2017, 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[23] Onur Mutlu, et al. LazyPIM: An Efficient Cache Coherence Mechanism for Processing-in-Memory, 2017, IEEE Computer Architecture Letters.
[24] Christoforos E. Kozyrakis, et al. Practical Near-Data Processing for In-Memory Analytics Frameworks, 2015, 2015 International Conference on Parallel Architecture and Compilation (PACT).
[25] Reetuparna Das, et al. Exploring specialized near-memory processing for data intensive operations, 2016, 2016 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[26] C. Martin. 2015, 2015, Les 25 ans de l’OMC: Une rétrospective en photos.
[27] Maged M. Michael, et al. Simple, fast, and practical non-blocking and blocking concurrent queue algorithms, 1996, PODC '96.
[28] Jung Ho Ahn, et al. Accelerating linked-list traversal through near-data processing, 2016, 2016 International Conference on Parallel Architecture and Compilation Techniques (PACT).
[29] Kiyoung Choi, et al. A scalable processing-in-memory accelerator for parallel graph processing, 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[30] Maurice Herlihy, et al. A Lazy Concurrent List-Based Set Algorithm, 2007, Parallel Process. Lett.
[31] Nir Shavit, et al. Flat combining and the synchronization-parallelism tradeoff, 2010, SPAA '10.
[32] Mike Ignatowski, et al. TOP-PIM: throughput-oriented programmable processing in memory, 2014, HPDC '14.
[33] Hemangee K. Kapoor, et al. Towards Near Data Processing of Convolutional Neural Networks, 2018, 2018 31st International Conference on VLSI Design and 2018 17th International Conference on Embedded Systems (VLSID).
[34] Nam Sung Kim, et al. Fine-Grained Task Migration for Graph Algorithms Using Processing in Memory, 2016, 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW).
[35] Tajana Simunic, et al. GenPIM: Generalized processing in-memory to accelerate data intensive applications, 2018, 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[36] Guangyu Sun, et al. PM3: Power Modeling and Power Management for Processing-in-Memory, 2018, 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA).