AlignS: A Processing-In-Memory Accelerator for DNA Short Read Alignment Leveraging SOT-MRAM

Classified as a complex big data analytics problem, DNA short read alignment is a major sequential bottleneck in processing the massive amounts of data generated by next-generation sequencing platforms. With Von Neumann computing architectures struggling to handle such computationally expensive and memory-intensive tasks, Processing-in-Memory (PIM) platforms are attracting growing interest. In this paper, an energy-efficient and parallel PIM accelerator (AlignS) is proposed to execute DNA short read alignment based on an optimized, hardware-friendly alignment algorithm. We first develop the AlignS platform, which harnesses SOT-MRAM as computational memory and turns it into a fundamental processing unit for short read alignment. Accordingly, we present a novel, customized, highly parallel read alignment algorithm that requires only the proposed simple and parallel in-memory operations (i.e., comparisons and additions). AlignS is then optimized through a new correlated data partitioning and mapping methodology that allows local storage and processing of DNA sequences, in order to fully exploit algorithm-level parallelism and to accelerate both exact and inexact matching. The device-to-architecture co-simulation results show that AlignS improves the short read alignment throughput per Watt per mm² by ~12× compared to the ASIC accelerator. Compared to a recent FM-index-based ReRAM platform, AlignS achieves 1.6× higher throughput per Watt.
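The abstract does not spell out the alignment algorithm itself. As a rough illustration of how exact read matching can be reduced to comparisons and additions, the sketch below implements a textbook FM-index backward search in software. This is an assumption made for illustration, not necessarily the algorithm used in AlignS; all function names (`bwt_and_suffix_array`, `build_fm_index`, `backward_search`, `find_exact`) are hypothetical, and the naive suffix sorting is suitable only for short toy references.

```python
from collections import Counter


def bwt_and_suffix_array(text):
    """Build the suffix array and Burrows-Wheeler transform of text + '$'."""
    text += "$"  # sentinel, assumed absent from the reference
    # Naive construction: sort suffix start positions by the suffix itself.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT character is the one preceding each sorted suffix (wraps to '$').
    bwt = "".join(text[i - 1] for i in sa)
    return bwt, sa


def build_fm_index(bwt):
    """Return (first, occ) tables for the given BWT string.

    first[ch]  = number of characters in the BWT strictly smaller than ch
    occ[ch][i] = number of occurrences of ch in bwt[:i]
    """
    counts = Counter(bwt)
    first, total = {}, 0
    for ch in sorted(counts):
        first[ch] = total
        total += counts[ch]
    occ = {ch: [0] for ch in counts}
    for c in bwt:
        for ch in occ:
            # One comparison and one addition per symbol per position.
            occ[ch].append(occ[ch][-1] + (c == ch))
    return first, occ


def backward_search(read, first, occ, n):
    """Return the half-open suffix-array interval [low, high) matching read."""
    low, high = 0, n
    for ch in reversed(read):
        if ch not in first:
            return 0, 0  # symbol never occurs in the reference
        # Each step needs only table lookups and additions.
        low = first[ch] + occ[ch][low]
        high = first[ch] + occ[ch][high]
        if low >= high:
            return 0, 0  # interval is empty: no exact match
    return low, high


def find_exact(reference, read):
    """Return sorted start positions of exact occurrences of read."""
    bwt, sa = bwt_and_suffix_array(reference)
    first, occ = build_fm_index(bwt)
    low, high = backward_search(read, first, occ, len(bwt))
    return sorted(sa[i] for i in range(low, high))


if __name__ == "__main__":
    print(find_exact("ACGTACGTGA", "ACG"))
```

Each iteration of `backward_search` touches only two table entries per bound and performs two additions, which is the kind of simple, repetitive operation that lends itself to being evaluated close to where the tables are stored.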
