Relational query processing on OpenCL-based FPGAs

The release of OpenCL support for FPGAs is a significant step toward extending database applications to the reconfigurable domain. With the programmability offered by the OpenCL HLS tool, an OpenCL database can be easily ported to and redesigned for FPGAs. A single SQL query in these database systems usually consists of multiple operators, and each operator in turn consists of multiple OpenCL kernels. Because of the specific properties of FPGAs, each OpenCL kernel can use different combinations of FPGA-specific optimizations, namely the number of compute units (CU) and the degree of kernel vectorization (SIMD), and these choices are critical to overall query processing performance. Because the resources of a single FPGA image are limited, our query plan also considers the possibility of using multiple FPGA images. In this paper, we propose an FPGA-specific cost model that determines the optimal query plan in less than one minute. In particular, the time spent on FPGA synthesis is significantly reduced because the feasible query plans no longer all need to be evaluated on real FPGAs. Our cost model has two components: unit cost and optimal query plan generation. The first component generates multiple (unit cost, resource utilization) pairs for each kernel. The second component employs a dynamic programming approach to generate the optimal query plan, taking into account the possibility of using multiple FPGA images. The experiments show that 1) our cost model can accurately predict the performance of each feasible query plan for the input query and can guide the optimal query plan generation, and 2) our optimized query plan achieves a speedup of 1.5× to 4× over state-of-the-art query processing on OpenCL-based FPGAs.
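
The two components described above can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes the kernels run in a fixed linear order, that resource utilization is a single integer, that the cost of an image is the sum of its kernels' unit costs, and that switching to another FPGA image adds a constant overhead. All names and numbers (`KERNELS`, `BUDGET`, `RECONFIG_COST`, `image_cost`, `plan`) are hypothetical.

```python
# Hypothetical candidates per kernel: (unit_cost, resource_units) pairs, e.g.
# one pair per CU/SIMD combination. All values are made up for illustration.
KERNELS = [
    [(10.0, 2), (6.0, 4), (4.0, 7)],   # kernel 0
    [(8.0, 3), (5.0, 6)],              # kernel 1
    [(12.0, 2), (7.0, 5), (5.0, 8)],   # kernel 2
]
BUDGET = 10          # assumed resource units available in one FPGA image
RECONFIG_COST = 3.0  # assumed overhead of switching to another image

INF = float("inf")


def image_cost(kernels, budget):
    """Minimum summed cost of fitting all `kernels` into one image.

    Multiple-choice knapsack: pick exactly one (cost, resource) pair per
    kernel so that total resources stay within `budget`.
    """
    # best[r] = min cost of the kernels seen so far using at most r units
    best = [0.0] * (budget + 1)
    for options in kernels:
        nxt = [INF] * (budget + 1)
        for r in range(budget + 1):
            for cost, res in options:
                if res <= r and best[r - res] + cost < nxt[r]:
                    nxt[r] = best[r - res] + cost
        best = nxt
    return best[budget]


def plan(kernels, budget, reconfig_cost):
    """Split the kernel sequence into consecutive images with minimum cost.

    best[i] = min cost of executing the first i kernels; the last image
    holds kernels[j:i] for some j < i.
    """
    n = len(kernels)
    best = [INF] * (n + 1)
    split = [0] * (n + 1)
    best[0] = 0.0
    for i in range(1, n + 1):
        for j in range(i):
            # The first image is assumed to be loaded for free.
            overhead = reconfig_cost if j > 0 else 0.0
            total = best[j] + image_cost(kernels[j:i], budget) + overhead
            if total < best[i]:
                best[i], split[i] = total, j
    # Walk the split points backwards to recover the image boundaries.
    images, i = [], n
    while i > 0:
        images.append((split[i], i))
        i = split[i]
    return best[n], images[::-1]


if __name__ == "__main__":
    print(plan(KERNELS, BUDGET, RECONFIG_COST))
```

With these made-up numbers the sketch places kernels 0 and 1 in one image and kernel 2 in a second one, for a total cost of 19.0. Raising the assumed reconfiguration overhead makes a single image holding all three kernels (cost 25.0) the cheaper plan, which is the trade-off a multi-image query plan has to weigh.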
