Paper Information - Catalina: In-Storage Processing Acceleration for Scalable Big Data Analytics

Cloud applications play an increasingly crucial role in big data analytics. New use cases such as autonomous cars and edge computing call for novel approaches that combine heterogeneous computing and machine learning. These applications typically process petabyte-scale datasets and therefore require low-power, scalable storage that provides low-latency, high-throughput data access. While data centers have been migrating from legacy HDDs and SATA SSDs to high-throughput, low-latency NVMe SSDs, data bottlenecks still appear as capacity scales. One approach to tackling this problem is in-storage processing (ISP): performing the computation within the storage device, which eliminates the need to move the data. In this paper, we investigate the deployment of storage units with embedded low-power application processors together with FPGA-based reconfigurable hardware accelerators to address both performance and energy efficiency. To this end, we developed a high-capacity solid-state drive (SSD) named Catalina, equipped with a quad-core ARM A53 processor running a Linux operating system and a highly efficient FPGA accelerator for running applications in place. We evaluated the proposed approach in a case study using Faiss, a similarity search library.
