SCALCE: boosting sequence compression algorithms using locally consistent encoding
Faraz Hach | Süleyman Cenk Sahinalp | Can Alkan | Ibrahim Numanagić
[1] Szymon Grabowski,et al. Compression of DNA sequence reads in FASTQ format , 2011, Bioinform..
[2] Vladimir Yanovsky. ReCoil - an algorithm for compression of extremely large datasets of dna data , 2010, Algorithms for Molecular Biology.
[3] James Lowey,et al. G-SQZ: compact encoding of genomic sequence and quality data , 2010, Bioinform..
[4] Elizabeth M. Smigielski,et al. dbSNP: the NCBI database of genetic variation , 2001, Nucleic Acids Res..
[5] M. DePristo,et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data , 2011, Nature Genetics.
[6] Markus Hsi-Yang Fritz,et al. Efficient storage of high throughput DNA sequencing data using reference-based compression. , 2011, Genome research.
[8] P. Green,et al. Base-calling of automated sequencer traces using phred. I. Accuracy assessment. , 1998, Genome research.
[9] Uzi Vishkin,et al. Communication complexity of document exchange , 1999, SODA '00.
[10] Gonçalo R. Abecasis,et al. The Sequence Alignment/Map format and SAMtools , 2009, Bioinform..
[11] Abraham Lempel,et al. A universal algorithm for sequential data compression , 1977, IEEE Trans. Inf. Theory.
[12] Funda Ergün,et al. Oblivious string embeddings and edit distance approximations , 2006, SODA '06.
[13] Raffaele Giancarlo,et al. Boosting textual compression in optimal linear time , 2005, JACM.
[14] D. J. Wheeler,et al. A Block-sorting Lossless Data Compression Algorithm , 1994 .
[15] Raffaele Giancarlo,et al. The Engineering of a Compression Boosting Library: Theory vs Practice in BWT Compression , 2006, ESA.
[16] Giovanna Rosone,et al. Large-scale compression of genomic sequence databases with the Burrows-Wheeler transform , 2012, Bioinform..
[17] Uzi Vishkin,et al. Symmetry breaking for suffix tree construction , 1994, STOC '94.
[18] P Green,et al. Base-calling of automated sequencer traces using phred. II. Error probabilities. , 1998, Genome research.
[19] Abraham Lempel,et al. Compression of individual sequences via variable-rate coding , 1978, IEEE Trans. Inf. Theory.
[20] Kiyoshi Asai,et al. Transformations for the compression of FASTQ quality scores of next-generation sequencing data , 2012, Bioinform..
[21] Joshua M. Stuart,et al. Genome 10K: a proposal to obtain whole-genome sequence for 10,000 vertebrate species. , 2009, The Journal of heredity.
[22] Rangavittal Narayanan,et al. No-Reference Compression of Genomic Data Stored in FASTQ Format , 2011, 2011 IEEE International Conference on Bioinformatics and Biomedicine.
[23] D. Huffman. A Method for the Construction of Minimum-Redundancy Codes , 1952 .
[24] Bradley P. Coe,et al. Genome structural variation discovery and genotyping , 2011, Nature Reviews Genetics.
[25] Giovanni Manzini,et al. Compression boosting in optimal linear time using the Burrows-Wheeler Transform , 2004, SODA '04.
[26] Richard Durbin,et al. Fast and accurate short read alignment with Burrows-Wheeler transform , 2009, Bioinform..
[27] Alfred V. Aho,et al. Efficient string matching , 1975, Commun. ACM.
[28] G. Nolan,et al. Computational solutions to large-scale data management and analysis , 2010, Nature Reviews Genetics.
[29] Uzi Vishkin,et al. Efficient approximate and dynamic matching of patterns using a labeling paradigm , 1996, Proceedings of 37th Conference on Foundations of Computer Science.
[30] George Varghese,et al. Compressing Genomic Sequence Fragments Using SlimGene , 2010, RECOMB.
[31] Rasko Leinonen,et al. The sequence read archive: explosive growth of sequencing data , 2011, Nucleic Acids Res..
[32] Meinolf Sellmann,et al. Symmetry Breaking , 2001, CP.