DNA Data Storage in Perl

Here we report a simple and flexible method for DNA data storage based on a Perl script. In this approach, the text of the preamble of the “Universal Declaration of Human Rights”, consisting of 2,046 words, was encoded into a corresponding 8,148 base pairs of DNA using Perl-based encoding with a hash table. The encoded DNA sequences were then synthesized for storage. The information DNA consisted of 22 chemically synthesized DNA fragments of 400 nucleotides each, which were inserted into a cloning vector so that the plasmid DNA could be propagated. The nucleotide integrity of the data-carrying DNA sequences was verified under accelerated aging conditions. In addition, an erroneous nucleotide in the information DNA sequences was successfully corrected using the overlap extension PCR method. The stored DNA was read by sequencing, and the resulting sequence information was decoded to recover the original document. Our results indicate that textual data can be stored in DNA using a simple, flexible Perl script run from the command line.
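The abstract does not give the actual lookup table, so the following is only a minimal sketch of what hash-table encoding of text into nucleotides could look like. It is written in Python for illustration, with a dict standing in for the Perl hash. The mapping of one byte to four bases, and the names `ENCODE`, `DECODE`, `encode_text`, `decode_dna`, and `split_fragments`, are assumptions made for this example and not the authors' scheme.

```python
from itertools import product

BASES = "ACGT"

# Hypothetical lookup table (an assumption, not the published one):
# each byte value 0..255 gets a unique 4-nucleotide codeword.
ENCODE = {byte: "".join(quad) for byte, quad in enumerate(product(BASES, repeat=4))}
# Reverse table used when reading the sequence back.
DECODE = {quad: byte for byte, quad in ENCODE.items()}


def encode_text(text):
    """Convert text to a DNA string, four bases per UTF-8 byte."""
    return "".join(ENCODE[b] for b in text.encode("utf-8"))


def decode_dna(dna):
    """Convert a DNA string produced by encode_text back to text."""
    if len(dna) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    data = bytes(DECODE[dna[i:i + 4]] for i in range(0, len(dna), 4))
    return data.decode("utf-8")


def split_fragments(dna, size=400):
    """Cut a sequence into consecutive pieces of at most `size` bases."""
    return [dna[i:i + size] for i in range(0, len(dna), size)]


if __name__ == "__main__":
    message = "Whereas recognition of the inherent dignity"
    dna = encode_text(message)
    print(dna[:40])
    print(decode_dna(dna) == message)
```

A fixed-width table keeps decoding trivial, since the sequence can be read back four bases at a time. A practical scheme would likely also need to handle constraints such as long homopolymer runs, which this sketch ignores.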
