RAMPS: Reconfigurable architecture for minimal perfect sequencing using the Convey hybrid core computer

The alignment of many short sequences of DNA, called reads, to a long reference genome is a common task in molecular biology. When the problem is expanded to handle typical workloads of billions of reads, execution time becomes critical. While existing solutions attempt to align a high percentage of the reads using a small memory footprint, RAMPS (Reconfigurable Architecture for Minimal Perfect Sequencing) focuses on performing fast exact matching. Using the human genome as a reference, RAMPS aligns short reads on the order of hundreds of thousands of times faster than current software implementations such as SOAP2 or Bowtie, and about 1000 times faster than GPU implementations such as SOAP3. Whereas other aligners require hours to preprocess reference genomes, RAMPS can preprocess the human genome in a few minutes. This makes it practical to use arbitrary reference sources for alignment, which in turn increases the amount of data that aligns with the reference.
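As a rough illustration of what exact matching against a preprocessed reference means, the sketch below indexes every fixed-length substring of a toy reference and looks reads up in that index. It is a minimal software analogy only: the abstract does not describe the RAMPS hardware design, an ordinary Python dictionary stands in for the minimal perfect hash, and the names `build_index` and `align_exact`, the toy reference string, and the read length are all assumptions made for this example.

```python
from collections import defaultdict


def build_index(reference, read_len):
    """Preprocess: map each read_len-long substring to its start positions."""
    index = defaultdict(list)
    for pos in range(len(reference) - read_len + 1):
        index[reference[pos:pos + read_len]].append(pos)
    # Return a plain dict so lookups of unknown reads do not add entries.
    return dict(index)


def align_exact(index, read):
    """Return every reference position where the read matches exactly."""
    return index.get(read, [])


# Toy data (hypothetical, not from the paper).
reference = "ACGTACGTTAGC"
index = build_index(reference, read_len=4)

print(align_exact(index, "ACGT"))  # read occurring twice in the reference
print(align_exact(index, "GGGG"))  # read that does not occur at all
```

Once the index is built, each read costs a single lookup. The trade-off is memory, since the index holds one entry per distinct substring of the reference, and a read that differs from the reference by even one base finds no match.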
