Efficient query processing in crossbar memory

Today's computing systems use huge amount of energy and time to process basic queries in database. A large part of it is spent in data movement between the memory and processing cores, owing to the limited cache capacity and memory bandwidth of traditional computers. In this paper, we propose a non-volatile memory-based query accelerator, called NVQuery, which performs several basic query functions in memory including aggregation, prediction, bit-wise operations, as well as exact and nearest distance search queries. NVQuery is implemented on a content addressable memory (CAM) and exploits the analog characteristic of non-volatile memory in order to enable in-memory processing. To implement nearest distance search in memory, we introduce a novel bitline driving scheme to give weights to the indices of the bits during the search operation. Our experimental evaluation shows that, NVQuery can provide 49.3× performance speedup and 32.9× energy savings as compared to running the same query on traditional processor. In addition, compared to the state-of-the-art query accelerators, NVQuery can achieve 26.2× energy-delay product improvement while providing the similar accuracy.

[1]  Steven Swanson,et al.  Near-Data Processing: Insights from a MICRO-46 Workshop , 2014, IEEE Micro.

[2]  Mohsen Imani,et al.  Ultra-efficient processing in-memory for data intensive applications , 2017, 2017 54th ACM/EDAC/IEEE Design Automation Conference (DAC).

[3]  Nishil Talati,et al.  Logic Design Within Memristive Memories Using Memristor-Aided loGIC (MAGIC) , 2016, IEEE Transactions on Nanotechnology.

[4]  M. Anusha,et al.  Big Data-Survey , 2016 .

[5]  Marimuthu Palaniswami,et al.  Internet of Things (IoT): A vision, architectural elements, and future directions , 2012, Future Gener. Comput. Syst..

[6]  Tajana Simunic,et al.  MPIM: Multi-purpose in-memory processing using configurable resistive memory , 2017, 2017 22nd Asia and South Pacific Design Automation Conference (ASP-DAC).

[7]  Tajana Rosing,et al.  Resistive CAM Acceleration for Tunable Approximate Computing , 2019, IEEE Transactions on Emerging Topics in Computing.

[8]  Mehdi Saremi Carrier mobility extraction method in ChGs in the UV light exposure , 2016 .

[9]  David R. Kaeli,et al.  Exploring the multiple-GPU design space , 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.

[10]  Farinaz Koushanfar,et al.  LookNN: Neural network with no multiplication , 2017, Design, Automation & Test in Europe Conference & Exhibition (DATE), 2017.

[11]  Jan M. Rabaey,et al.  Exploring Hyperdimensional Associative Memory , 2017, 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA).

[12]  Veda C. Storey,et al.  Business Intelligence and Analytics: From Big Data to Big Impact , 2012, MIS Q..

[13]  Jordan Tigani,et al.  Google BigQuery Analytics , 2014 .

[14]  Mohammad Arjomand,et al.  Reducing access latency of MLC PCMs through line striping , 2014, 2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA).

[15]  John E. Stone,et al.  An asymmetric distributed shared memory model for heterogeneous parallel systems , 2010, ASPLOS XV.

[16]  Ronald F. DeMara,et al.  Variation-immune resistive Non-Volatile Memory using self-organized sub-bank circuit designs , 2017, 2017 18th International Symposium on Quality Electronic Design (ISQED).

[17]  Kim M. Hazelwood,et al.  Where is the data? Why you cannot debate CPU vs. GPU performance without the answer , 2011, (IEEE ISPASS) IEEE INTERNATIONAL SYMPOSIUM ON PERFORMANCE ANALYSIS OF SYSTEMS AND SOFTWARE.

[18]  Jignesh M. Patel,et al.  DAQ: A New Paradigm for Approximate Query Processing , 2015, Proc. VLDB Endow..

[19]  Feifei Li,et al.  Comparing Implementations of Near-Data Computing with In-Memory MapReduce Workloads , 2014, IEEE Micro.

[20]  Tajana Simunic,et al.  ACAM: Approximate Computing Based on Adaptive Associative Memory with Online Learning , 2016, ISLPED.

[21]  Viswanath Poosala,et al.  Aqua: A Fast Decision Support Systems Using Approximate Query Answers , 1999, VLDB.

[22]  Hakan Hacigümüs,et al.  MISO: souping up big data query processing with a multistore system , 2014, SIGMOD Conference.

[23]  Ion Stoica,et al.  BlinkDB: queries with bounded errors and bounded response times on very large data , 2012, EuroSys '13.

[24]  Eby G. Friedman,et al.  VTEAM – A General Model for Voltage Controlled Memristors , 2014 .

[25]  Matei Ripeanu,et al.  StoreGPU: exploiting graphics processing units to accelerate distributed storage systems , 2008, HPDC '08.

[26]  Kevin Skadron,et al.  Accelerating SQL database operations on a GPU with CUDA , 2010, GPGPU-3.