Enabling High-Performance and Energy-Efficient Hybrid Transactional/Analytical Databases with Hardware/Software Cooperation
[1] Onur Mutlu,et al. MISE: Providing performance predictability and improving fairness in shared main memory systems , 2013, 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA).
[2] Rachata Ausavarungnirun,et al. RowClone: Accelerating Data Movement and Initialization Using DRAM , 2018, ArXiv.
[3] Kenneth A. Ross,et al. Navigating big data with high-throughput, energy-efficient data partitioning , 2013, ISCA.
[4] Kevin Kai-Wei Chang,et al. Staged memory scheduling: Achieving high performance and scalability in heterogeneous systems , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).
[5] Michael Stonebraker,et al. H-store: a high-performance, distributed main memory transaction processing system , 2008, Proc. VLDB Endow..
[6] Jeremie S. Kim,et al. GenStore: A High-Performance and Energy-Efficient In-Storage Computing System for Genome Sequence Analysis , 2022, ArXiv.
[7] Anastasia Ailamaki,et al. Adaptive HTAP through Elastic Resource Scheduling , 2020, SIGMOD Conference.
[8] Oscar G. Plata,et al. Enabling fast and energy-efficient FM-index exact matching using processing-near-memory , 2021, The Journal of Supercomputing.
[9] Alfons Kemper,et al. ScyPer: A Hybrid OLTP&OLAP Distributed Main Memory Database System for Scalable Real-Time Analytics , 2013, BTW.
[10] Bradford M. Beckmann,et al. The gem5 simulator , 2011, CARN.
[11] Jens Dittrich,et al. Accelerating Analytical Processing in MVCC using Fine-Granular High-Frequency Virtual Snapshotting , 2017, SIGMOD Conference.
[12] Xu Chen,et al. KunPeng: Parameter Server based Distributed Learning Systems and Its Applications in Alibaba and Ant Financial , 2017, KDD.
[13] Rachata Ausavarungnirun,et al. CoNDA: Efficient Cache Coherence Support for Near-Data Accelerators , 2019, 2019 ACM/IEEE 46th Annual International Symposium on Computer Architecture (ISCA).
[14] Onur Mutlu,et al. DAMOV: A New Methodology and Benchmark Suite for Evaluating Data Movement Bottlenecks , 2021, IEEE Access.
[15] Daniel Sánchez,et al. Adaptive Scheduling for Systems with Asymmetric Memory Hierarchies , 2018, 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[16] Manos Athanassoulis,et al. Beyond the Wall: Near-Data Processing for Databases , 2015, DaMoN.
[17] Josep Torrellas,et al. Scalable Cache Miss Handling for High Memory-Level Parallelism , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[18] Sander Stuijk,et al. NAPEL: Near-Memory Computing Application Performance Prediction via Ensemble Learning , 2019, 2019 56th ACM/IEEE Design Automation Conference (DAC).
[19] Jon T. S. Quah,et al. Real Time Credit Card Fraud Detection using Computational Intelligence , 2007, 2007 International Joint Conference on Neural Networks.
[20] Luigi Carro,et al. HIPE: HMC instruction predication extension applied on database processing , 2018, 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[21] Chuan-Ming Liu,et al. Big data stream computing in healthcare real-time analytics , 2016, 2016 IEEE International Conference on Cloud Computing and Big Data Analysis (ICCCBDA).
[22] Sally Chisholm. Adopting medical technologies and diagnostics recommended by NICE: the Health Technologies Adoption Programme , 2014, Annals of the Royal College of Surgeons of England.
[23] Onur Mutlu,et al. Memory Performance Attacks: Denial of Memory Service in Multi-Core Systems , 2007, USENIX Security Symposium.
[24] Xuan Zhang,et al. Near-Memory Processing in Action: Accelerating Personalized Recommendation With AxDIMM , 2021, IEEE Micro.
[25] Onur Mutlu,et al. Google Neural Network Models for Edge Devices: Analyzing and Mitigating Machine Learning Inference Bottlenecks , 2021, 2021 30th International Conference on Parallel Architectures and Compilation Techniques (PACT).
[26] Mahmut T. Kandemir,et al. Scheduling techniques for GPU architectures with processing-in-memory capabilities , 2016, 2016 International Conference on Parallel Architecture and Compilation Techniques (PACT).
[27] Yu Wang,et al. GraphH: A Processing-in-Memory Architecture for Large-Scale Graph Processing , 2019, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[28] Norman May,et al. Task Scheduling for Highly Concurrent Analytical and Transactional Main-Memory Workloads , 2013, ADMS@VLDB.
[29] Steven Swanson,et al. Near-Data Processing: Insights from a MICRO-46 Workshop , 2014, IEEE Micro.
[30] Christian S. Jensen,et al. A comparison of the use of virtual versus physical snapshots for supporting update-intensive workloads , 2012, DaMoN '12.
[31] James E. Smith,et al. Fair Queuing Memory Systems , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[32] Maya Gokhale,et al. Combining Emulation and Simulation to Evaluate a Near Memory Key/Value Lookup Accelerator , 2021, ArXiv.
[33] Pei Liu,et al. 3D-Stacked Many-Core Architecture for Biological Sequence Analysis Problems , 2015, 2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS).
[34] Luca Benini,et al. A Logic-base Interconnect for Supporting Near Memory Computation in the Hybrid Memory Cube , 2014 .
[35] Matthew Poremba,et al. There and back again: Optimizing the interconnect in networks of memory cubes , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[36] Sander Stuijk,et al. Accelerating Weather Prediction Using Near-Memory Reconfigurable Fabric , 2021, ACM Trans. Reconfigurable Technol. Syst..
[37] David J. DeWitt,et al. Materialization Strategies in a Column-Oriented DBMS , 2007, 2007 IEEE 23rd International Conference on Data Engineering.
[38] Alfons Kemper,et al. Fast Serializable Multi-Version Concurrency Control for Main-Memory Database Systems , 2015, SIGMOD Conference.
[39] Onur Mutlu,et al. The application slowdown model: Quantifying and controlling the impact of inter-application interference at shared caches and main memory , 2015, 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[40] Peter C. Ma,et al. Ten Lessons From Three Generations Shaped Google’s TPUv4i: Industrial Product , 2021, 2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA).
[41] Izzat El Hajj,et al. Benchmarking a New Paradigm: An Experimental Analysis of a Real Processing-in-Memory Architecture , 2021, ArXiv.
[42] Srinivas Devadas,et al. TicToc: Time Traveling Optimistic Concurrency Control , 2016, SIGMOD Conference.
[43] Yuhwan Ro,et al. Aquabolt-XL: Samsung HBM2-PIM with in-memory processing for ML accelerators and beyond , 2021, 2021 IEEE Hot Chips 33 Symposium (HCS).
[44] Zhengping Qian,et al. Real-time Constrained Cycle Detection in Large Dynamic Graphs , 2018, Proc. VLDB Endow..
[45] O Seongil,et al. 25.4 A 20nm 6GB Function-In-Memory DRAM, Based on HBM2 with a 1.2TFLOPS Programmable Computing Unit Using Bank-Level Parallelism, for Machine Learning Applications , 2021, 2021 IEEE International Solid-State Circuits Conference (ISSCC).
[46] Onur Mutlu,et al. The DRAM Latency PUF: Quickly Evaluating Physical Unclonable Functions by Exploiting the Latency-Reliability Tradeoff in Modern Commodity DRAM Devices , 2018, 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[47] Onur Mutlu,et al. Ambit: In-Memory Accelerator for Bulk Bitwise Operations Using Commodity DRAM Technology , 2017, 2017 50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[48] Norman May,et al. Scaling Up Mixed Workloads: A Battle of Data Freshness, Flexibility, and Scheduling , 2014, TPCTC.
[49] Wolfgang Lehner,et al. SAP HANA: The Evolution from a Modern Main-Memory Data Platform to an Enterprise Application Platform , 2013, Proc. VLDB Endow..
[50] Jim Gray,et al. A critique of ANSI SQL isolation levels , 1995, SIGMOD '95.
[51] Michael Stonebraker,et al. Staring into the Abyss: An Evaluation of Concurrency Control with One Thousand Cores , 2014, Proc. VLDB Endow..
[52] Mustafa Canim,et al. L-Store: A Real-time OLTP and OLAP System , 2016, EDBT.
[53] Onur Mutlu,et al. Low-Cost Inter-Linked Subarrays (LISA): Enabling fast inter-subarray data movement in DRAM , 2016, 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[54] Reetuparna Das,et al. Application-to-core mapping policies to reduce memory system interference in multi-core systems , 2013, 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA).
[55] Goetz Graefe,et al. The Volcano optimizer generator: extensibility and efficient search , 1993, Proceedings of IEEE 9th International Conference on Data Engineering.
[56] Onur Mutlu,et al. Stall-Time Fair Memory Access Scheduling for Chip Multiprocessors , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[57] Il Memming Park,et al. A 1ynm 1.25V 8Gb, 16Gb/s/pin GDDR6-based Accelerator-in-Memory supporting 1TFLOPS MAC Operation and Various Activation Functions for Deep-Learning Applications , 2022, 2022 IEEE International Solid-State Circuits Conference (ISSCC).
[58] Sai Prashanth Muralidhara,et al. Reducing memory interference in multicore systems via application-aware memory channel partitioning , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[59] Onur Mutlu,et al. Main Memory Scaling: Challenges and Solution Directions , 2015 .
[60] Onur Mutlu,et al. Parallelism-Aware Batch Scheduling: Enhancing both Performance and Fairness of Shared DRAM Systems , 2008, 2008 International Symposium on Computer Architecture.
[61] Parthasarathy Ranganathan,et al. Warehouse-scale video acceleration: co-design and deployment in the wild , 2021, ASPLOS.
[62] Onur Mutlu,et al. Improving memory Bank-Level Parallelism in the presence of prefetching , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[63] O Seongil,et al. Hardware Architecture and Software Stack for PIM Based on Commercial DRAM Technology: Industrial Product , 2021, 2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA).
[64] Maurice Herlihy,et al. Concurrent Data Structures for Near-Memory Computing , 2017, SPAA.
[65] Gabriel H. Loh,et al. 3D-Stacked Memory Architectures for Multi-core Processors , 2008, 2008 International Symposium on Computer Architecture.
[66] Onur Mutlu,et al. A Workload and Programming Ease Driven Perspective of Processing-in-Memory , 2019, ArXiv.
[67] Kevin Wilkinson,et al. Janus: Transaction Processing of Navigation and Analytic Graph Queries on Many-core Servers , 2017, CIDR.
[68] Onur Mutlu,et al. BLISS: Balancing Performance, Fairness and Complexity in Memory Access Scheduling , 2016, IEEE Transactions on Parallel and Distributed Systems.
[69] Kiyoung Choi,et al. A scalable processing-in-memory accelerator for parallel graph processing , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[70] Craig Freedman,et al. Hekaton: SQL server's memory-optimized OLTP engine , 2013, SIGMOD '13.
[71] Rachata Ausavarungnirun,et al. Processing Data Where It Makes Sense: Enabling In-Memory Computation , 2019, Microprocess. Microsystems.
[72] Alexander Zeier,et al. HYRISE - A Main Memory Hybrid Storage Engine , 2010, Proc. VLDB Endow..
[73] Divyakant Agrawal,et al. Janus: A Hybrid Scalable Multi-Representation Cloud Datastore , 2018, IEEE Transactions on Knowledge and Data Engineering.
[74] Onur Mutlu,et al. PiDRAM: A Holistic End-to-end FPGA-based Framework for Processing-in-DRAM , 2021, ArXiv.
[75] Babak Falsafi,et al. The mondrian data engine , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[76] Onur Mutlu,et al. D-RaNGe: Using Commodity DRAM Devices to Generate True Random Numbers with Low Latency and High Throughput , 2018, 2019 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[77] Onur Mutlu,et al. Runahead execution: an alternative to very large instruction windows for out-of-order processors , 2003, The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings..
[78] Hamid Pirahesh,et al. WiSer: A Highly Available HTAP DBMS for IoT Applications , 2019, 2019 IEEE International Conference on Big Data (Big Data).
[79] Onur Mutlu,et al. Kilo-NOC: A heterogeneous network-on-chip architecture for scalability and service guarantees , 2011, 2011 38th Annual International Symposium on Computer Architecture (ISCA).
[80] Torsten Hoefler,et al. SISA: Set-Centric Instruction Set Architecture for Graph Mining on Processing-in-Memory Systems , 2021, MICRO.
[81] Babak Falsafi,et al. Sort vs. Hash Join Revisited for Near-Memory Execution , 2015 .
[82] Yu Xia,et al. Taurus: Lightweight Parallel Logging for In-Memory Database Management Systems (Extended Version) , 2020, VLDB 2020.
[83] Onur Mutlu,et al. Accelerating pointer chasing in 3D-stacked memory: Challenges, mechanisms, evaluation , 2016, 2016 IEEE 34th International Conference on Computer Design (ICCD).
[84] Peter M. Kogge,et al. EXECUBE-A New Architecture for Scaleable MPPs , 1994, 1994 International Conference on Parallel Processing Vol. 1.
[85] Christoforos E. Kozyrakis,et al. Practical Near-Data Processing for In-Memory Analytics Frameworks , 2015, 2015 International Conference on Parallel Architecture and Compilation (PACT).
[86] Kenneth A. Ross,et al. Q100: the architecture and design of a database processing unit , 2014, ASPLOS.
[87] Irving L. Traiger,et al. The notions of consistency and predicate locks in a database system , 1976, CACM.
[88] Jing Wang,et al. Processing-in-Memory Enabled Graphics Processors for 3D Rendering , 2017, 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[89] Feifei Li,et al. Fixed-function hardware sorting accelerators for near data MapReduce execution , 2015, 2015 33rd IEEE International Conference on Computer Design (ICCD).
[90] Yanzhi Wang,et al. GraphQ: Scalable PIM-Based Graph Processing , 2019, MICRO.
[91] Viktor Leis,et al. Compiling Database Queries into Machine Code , 2014, IEEE Data Eng. Bull..
[92] Feifei Li,et al. NDC: Analyzing the impact of 3D-stacked memory+logic devices on MapReduce workloads , 2014, 2014 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[93] Onur Mutlu,et al. Accelerating Dependent Cache Misses with an Enhanced Memory Controller , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[94] Luigi Carro,et al. Processing in 3D memories to speed up operations on complex data structures , 2018, 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[95] Onur Mutlu,et al. SIMDRAM: a framework for bit-serial SIMD processing using DRAM , 2020, ASPLOS.
[96] Alexander Zeier,et al. Speeding Up Queries in Column Stores - A Case for Compression , 2010, DaWak.
[97] Franz Franchetti,et al. Data reorganization in memory using 3D-stacked DRAM , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[98] Mor Harchol-Balter,et al. ATLAS: A Scalable and High-Performance Scheduling Algorithm for Multiple Memory Controllers , 2010 .
[99] Juliana Freire,et al. Virtual lightweight snapshots for consistent analytics in NoSQL stores , 2016, 2016 IEEE 32nd International Conference on Data Engineering (ICDE).
[100] Marcin Zukowski,et al. MonetDB/X100: Hyper-Pipelining Query Execution , 2005, CIDR.
[101] Jaeha Kim,et al. Memory-centric system interconnect design with Hybrid Memory Cubes , 2013, Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques.
[102] Christoforos E. Kozyrakis,et al. TETRIS: Scalable and Efficient Neural Network Acceleration with 3D Memory , 2017, ASPLOS.
[103] Onur Mutlu,et al. Techniques for efficient processing in runahead execution engines , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).
[104] Christoforos E. Kozyrakis,et al. GraphP: Reducing Communication for PIM-Based Graph Processing with Efficient Data Partition , 2018, 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[105] Anastasia Ailamaki,et al. Performance Characterization of HTAP Workloads , 2021, 2021 IEEE 37th International Conference on Data Engineering (ICDE).
[106] Onur Mutlu,et al. pLUTo: Enabling Massively Parallel Computation in DRAM via Lookup Tables , 2021, 2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO).
[107] Fairness via source throttling: a configurable and high-performance fairness substrate for multi-core memory systems , 2010, ASPLOS XV.
[108] Jialiang Zhang,et al. MEG: A RISCV-based System Emulation Infrastructure for Near-data Processing Using FPGAs and High-bandwidth Memory , 2020, ACM Trans. Reconfigurable Technol. Syst..
[109] Setrag Khoshafian,et al. A decomposition storage model , 1985, SIGMOD Conference.
[110] Brian Fahs,et al. Microarchitecture optimizations for exploiting memory-level parallelism , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004..
[111] Babak Falsafi,et al. Near-Memory Address Translation , 2016, 2017 26th International Conference on Parallel Architectures and Compilation Techniques (PACT).
[112] Rachata Ausavarungnirun,et al. Google Workloads for Consumer Devices: Mitigating Data Movement Bottlenecks , 2018, ASPLOS.
[113] Weiyun Huang,et al. Real-Time Analytical Processing with SQL Server , 2015, Proc. VLDB Endow..
[114] Onur Mutlu,et al. Efficient Runahead Execution: Power-Efficient Memory Latency Tolerance , 2006, IEEE Micro.
[115] Michael Stonebraker,et al. C-Store: A Column-oriented DBMS , 2005, VLDB.
[116] Damla Senol Cali,et al. FPGA-Based Near-Memory Acceleration of Modern Data-Intensive Applications , 2021, IEEE Micro.
[117] Janak H. Patel,et al. A low-overhead coherence solution for multiprocessors with private cache memories , 1984, ISCA '84.
[118] Qi Liu,et al. TiDB , 2020, Proc. VLDB Endow..
[119] Rachata Ausavarungnirun,et al. A Modern Primer on Processing in Memory , 2020, ArXiv.
[120] J. Jeddeloh,et al. Hybrid memory cube new DRAM architecture increases density and performance , 2012, 2012 Symposium on VLSI Technology (VLSIT).
[121] Kevin Kai-Wei Chang,et al. DASH: Deadline-Aware High-Performance Memory Scheduler for Heterogeneous Systems with Hardware Accelerators , 2016, ACM Trans. Archit. Code Optim..
[122] Jayanthi Ranjan,et al. Real time business intelligence in supply chain analytics , 2008, Inf. Manag. Comput. Secur..
[123] Hyesoon Kim,et al. BSSync: Processing Near Memory for Machine Learning Workloads with Bounded Staleness Consistency Models , 2015, 2015 International Conference on Parallel Architecture and Compilation (PACT).
[124] Rachata Ausavarungnirun,et al. RowClone: Fast and energy-efficient in-DRAM bulk data copy and initialization , 2013, 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[125] Onur Mutlu,et al. In-DRAM Bulk Bitwise Execution Engine , 2019, ArXiv.
[126] Ramyad Hadidi,et al. GraphPIM: Enabling Instruction-Level PIM Offloading in Graph Computing Frameworks , 2017, 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[127] Rachata Ausavarungnirun,et al. GenASM: A High-Performance, Low-Power Approximate String Matching Acceleration Framework for Genome Sequence Analysis , 2020, 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[128] Gustavo Alonso,et al. BatchDB: Efficient Isolated Execution of Hybrid OLTP+OLAP Workloads for Interactive Applications , 2017, SIGMOD Conference.
[129] Yuanyuan Tian,et al. Hybrid Transactional/Analytical Processing: A Survey , 2017, SIGMOD Conference.
[130] Nectarios Koziris,et al. SynCron: Efficient Synchronization Support for Near-Data-Processing Architectures , 2021, 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA).
[131] Bingsheng He,et al. Database compression on graphics processors , 2010, Proc. VLDB Endow..
[132] David Wentzlaff,et al. ComputeDRAM: In-Memory Compute Using Off-the-Shelf DRAMs , 2019, MICRO.
[133] Ramyad Hadidi,et al. CAIRO , 2017, ACM Trans. Archit. Code Optim..
[134] Paul Feautrier,et al. A New Solution to Coherence Problems in Multicache Systems , 1978, IEEE Transactions on Computers.
[135] Tejas Karkhanis,et al. Active Memory Cube: A processing-in-memory architecture for exascale systems , 2015, IBM J. Res. Dev..
[136] Jeremie S. Kim,et al. FIGARO: Improving System Performance via Fine-Grained In-DRAM Data Relocation and Caching , 2020, 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[137] Ran Ginosar,et al. GP-SIMD Processing-in-Memory , 2015, ACM Trans. Archit. Code Optim..
[138] Carsten Binnig,et al. Dictionary-based order-preserving string compression for main memory column stores , 2009, SIGMOD Conference.
[139] Shaahin Angizi,et al. GraphiDe: A Graph Processing Accelerator leveraging In-DRAM-Computing , 2019, ACM Great Lakes Symposium on VLSI.
[140] Onur Mutlu,et al. Buddy-RAM: Improving the Performance and Efficiency of Bulk Bitwise Operations Using DRAM , 2016, ArXiv.
[141] Duncan G. Elliott,et al. Computational RAM: Implementing Processors in Memory , 1999, IEEE Des. Test Comput..
[142] Onur Mutlu,et al. Casper: Accelerating Stencil Computation using Near-cache Processing , 2021, ArXiv.
[143] Maya Gokhale,et al. In-Memory Data Rearrangement for Irregular, Data-Intensive Computing , 2015, Computer.
[144] Philip A. Bernstein,et al. Categories and Subject Descriptors: H.2.4 [Database Management]: Systems. , 2022 .
[145] Jongmoo Choi,et al. Decoupled Direct Memory Access: Isolating CPU and IO Traffic by Leveraging a Dual-Data-Port DRAM , 2015, 2015 International Conference on Parallel Architecture and Compilation (PACT).
[146] Onur Mutlu,et al. Memory scaling: A systems architecture perspective , 2013, 2013 5th IEEE International Memory Workshop.
[147] Mor Harchol-Balter,et al. Thread Cluster Memory Scheduling: Exploiting Differences in Memory Access Behavior , 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.
[148] Yuan Qi,et al. TitAnt: Online Real-time Transaction Fraud Detection in Ant Financial , 2019, Proc. VLDB Endow..
[149] Steffen Zeuch,et al. QTM: Modelling Query Execution with Tasks , 2014, ADMS@VLDB.
[150] Norman May,et al. The SAP HANA Database -- An Architecture Overview , 2012, IEEE Data Eng. Bull..
[151] Anastasia Ailamaki,et al. The Case For Heterogeneous HTAP , 2017, CIDR.
[152] Onur Mutlu,et al. Transparent Offloading and Mapping (TOM): Enabling Programmer-Transparent Near-Data Processing in GPU Systems , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[153] Srinivas Devadas,et al. Sundial: Harmonizing Concurrency Control and Caching in a Distributed OLTP Database Management System , 2018, Proc. VLDB Endow..
[154] Viktor Leis,et al. Morsel-driven parallelism: a NUMA-aware query evaluation framework for the many-core age , 2014, SIGMOD Conference.
[155] Edward D. Lazowska,et al. A Comparison of Receiver-Initiated and Sender-Initiated Adaptive Load Sharing , 1986, Perform. Evaluation.
[156] Masoud Daneshtalab,et al. NoM: Network-on-Memory for Inter-Bank Data Transfer in Highly-Banked Memories , 2020, IEEE Computer Architecture Letters.
[157] Andrew F. Glew. MLP yes! ILP no , 1998, ASPLOS 1998.
[158] Kiyoung Choi,et al. PIM-enabled instructions: A low-overhead, locality-aware processing-in-memory architecture , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[159] Wook-Shin Han,et al. Parallel Replication across Formats in SAP HANA for Scaling Out Mixed OLTP/OLAP Workloads , 2017, Proc. VLDB Endow..
[160] Maya Gokhale,et al. Near memory key/value lookup acceleration , 2017, MEMSYS.
[161] Fabrice Devaux,et al. The true Processing In Memory accelerator , 2019, 2019 IEEE Hot Chips 31 Symposium (HCS).
[162] Luigi Carro,et al. Operand size reconfiguration for big data processing in memory , 2017, Design, Automation & Test in Europe Conference & Exhibition (DATE), 2017.
[163] L. V. Gutierrez,et al. ASIC Clouds: Specializing the Datacenter , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[164] Onur Mutlu,et al. GRIM-Filter: Fast seed location filtering in DNA read mapping using processing-in-memory technologies , 2017, BMC Genomics.
[165] Franz Franchetti,et al. Accelerating sparse matrix-matrix multiplication with 3D-stacked logic-in-memory hardware , 2013, 2013 IEEE High Performance Extreme Computing Conference (HPEC).
[166] Onur Mutlu,et al. LazyPIM: An Efficient Cache Coherence Mechanism for Processing-in-Memory , 2017, IEEE Computer Architecture Letters.
[167] Thomas Neumann,et al. TPC-H Analyzed: Hidden Messages and Lessons Learned from an Influential Benchmark , 2013, TPCTC.
[168] Eitan Medina,et al. Habana Labs Purpose-Built AI Inference and Training Processor Architectures: Scaling AI Training Systems Using Standard Ethernet With Gaudi Processor , 2020, IEEE Micro.
[169] Zhe Zhang,et al. 184QPS/W 64Mb/mm² 3D Logic-to-DRAM Hybrid Bonding with Process-Near-Memory Engine for Recommendation System , 2022, IEEE International Solid-State Circuits Conference.
[170] Onur Mutlu,et al. LazyPIM: Efficient Support for Cache Coherence in Processing-in-Memory Architectures , 2017, ArXiv.
[171] Onur Mutlu,et al. GRIM-filter: fast seed filtering in read mapping using emerging memory technologies , 2017, 1708.04329.
[172] William H. Kautz,et al. Cellular Logic-in-Memory Arrays , 1969, IEEE Transactions on Computers.
[173] Dinesh Das,et al. Oracle Database In-Memory: A dual format in-memory database , 2015, 2015 IEEE 31st International Conference on Data Engineering.
[174] Peter Bumbulis,et al. Towards Scalable Real-time Analytics: An Architecture for Scale-out of OLxP Workloads , 2015, Proc. VLDB Endow..
[175] Onur Mutlu,et al. Benchmarking Memory-Centric Computing Systems: Analysis of Real Processing-In-Memory Hardware , 2021, 2021 12th International Green and Sustainable Computing Conference (IGSC).
[176] Maya Gokhale,et al. Towards a scatter-gather architecture: hardware and software issues , 2019, MEMSYS.
[177] Mohammad Sadoghi,et al. Hybrid OLTP and OLAP , 2019, Encyclopedia of Big Data Technologies.
[178] Frederic T. Chong,et al. Active pages: a computation model for intelligent memory , 1998, ISCA.
[179] Onur Mutlu,et al. QUAC-TRNG: High-Throughput True Random Number Generation Using Quadruple Row Activation in Commodity DRAM Chips , 2021, 2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA).
[180] Mingyu Gao,et al. HRL: Efficient and flexible reconfigurable logic for near-data processing , 2016, 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[181] Luca Benini,et al. Logic-Base Interconnect Design for Near Memory Computing in the Smart Memory Cube , 2017, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[182] Stratos Idreos,et al. JAFAR: Near-Data Processing for Databases , 2015, SIGMOD Conference.
[183] Luigi Carro,et al. NIM: An HMC-Based Machine for Neuron Computation , 2017, ARC.
[184] Mike Ignatowski,et al. TOP-PIM: throughput-oriented programmable processing in memory , 2014, HPDC '14.
[185] Francisco J. Cazorla,et al. Multicore Resource Management , 2008, IEEE Micro.
[186] Norman P. Jouppi,et al. Optimizing NUCA Organizations and Wiring Alternatives for Large Caches with CACTI 6.0 , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[187] Chun Chen,et al. The architecture of the DIVA processing-in-memory chip , 2002, ICS '02.
[188] Onur Mutlu,et al. Research Problems and Opportunities in Memory Systems , 2014, Supercomput. Front. Innov..
[189] Goetz Graefe,et al. Encapsulation of parallelism in the Volcano query processing system , 1990, SIGMOD '90.
[190] Maya Gokhale,et al. Near memory data structure rearrangement , 2015, MEMSYS.
[191] Lavanya Subramanian,et al. Providing High and Controllable Performance in Multicore Systems Through Shared Resource Management , 2015, ArXiv.
[192] Christoforos E. Kozyrakis,et al. A case for intelligent RAM , 1997, IEEE Micro.
[193] Amirali Boroumand. Practical Mechanisms for Reducing Processor-Memory Data Movement in Modern Workloads , 2021 .
[194] Michael J. Cahill. Serializable isolation for snapshot databases , 2009, TODS.
[195] Sudhakar Yalamanchili,et al. Neurocube: A Programmable Digital Neuromorphic Architecture with High-Density 3D Memory , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[196] Alfons Kemper,et al. HyPer: A hybrid OLTP&OLAP main memory database system based on virtual memory snapshots , 2011, 2011 IEEE 27th International Conference on Data Engineering.
[197] Onur Mutlu,et al. A Case for MLP-Aware Cache Replacement , 2006, 33rd International Symposium on Computer Architecture (ISCA'06).
[198] Luca Benini,et al. High performance AXI-4.0 based interconnect for extensible smart memory cubes , 2015, 2015 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[199] Thomas Neumann,et al. Efficiently Compiling Efficient Query Plans for Modern Hardware , 2011, Proc. VLDB Endow..
[200] Maya Gokhale,et al. Processing in Memory: The Terasys Massively Parallel PIM Array , 1995, Computer.
[201] Onur Mutlu,et al. Processing-in-memory: A workload-driven perspective , 2019, IBM J. Res. Dev..
[202] Onur Mutlu,et al. Continuous runahead: Transparent hardware acceleration for memory intensive workloads , 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[203] Oracle GoldenGate 12c: Real-Time Access to Real-Time Information , 2014 .
[204] Pradeep Dubey,et al. Fast Updates on Read-Optimized Databases Using Multi-Core CPUs , 2011, Proc. VLDB Endow..
[205] Rasit O. Topaloglu,et al. More than Moore Technologies for Next Generation Computer Design , 2015 .
[206] Andrew Pavlo,et al. Bridging the Archipelago between Row-Stores and Column-Stores for Hybrid Workloads , 2016, SIGMOD Conference.
[207] Harold S. Stone,et al. A Logic-in-Memory Computer , 1970, IEEE Transactions on Computers.
[208] Onur Mutlu,et al. Accelerating Genome Analysis: A Primer on an Ongoing Journey , 2020, IEEE Micro.
[209] Yu Huang,et al. A Heterogeneous PIM Hardware-Software Co-Design for Energy-Efficient Graph Processing , 2020, 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS).
[210] Babak Falsafi,et al. Meet the walkers accelerating index traversals for in-memory databases , 2013, 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[211] Jung Ho Ahn,et al. NDA: Near-DRAM acceleration architecture leveraging commodity DRAM devices and standard memory modules , 2015, 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA).
[212] Oscar Plata,et al. NATSA: A Near-Data Processing Accelerator for Time Series Analysis , 2020, 2020 IEEE 38th International Conference on Computer Design (ICCD).
[213] Jinyoung Lee,et al. Biscuit: A Framework for Near-Data Processing of Big Data Workloads , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[214] Bahar Asgari,et al. Performance Implications of NoCs on 3D-Stacked Memories: Insights from the Hybrid Memory Cube , 2017, 2018 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[215] Barzan Mozafari,et al. SnappyData: Streaming, Transactions, and Interactive Analytics in a Unified Engine , 2016 .
[216] Franz Franchetti,et al. HAMLeT: Hardware accelerated memory layout transform within 3D-stacked DRAM , 2014, 2014 IEEE High Performance Extreme Computing Conference (HPEC).
[217] E. F. Codd,et al. The Relational Model for Database Management, Version 2 , 1990 .
[218] Gwangsun Kim,et al. Toward Standardized Near-Data Processing with Unrestricted Data Placement for GPUs , 2017, SC17: International Conference for High Performance Computing, Networking, Storage and Analysis.
[219] Norman May,et al. Scaling Up Concurrent Main-Memory Column-Store Scans: Towards Adaptive NUMA-aware Data and Task Placement , 2015, Proc. VLDB Endow..