Soft-Decision Decoding for DNA-Based Data Storage

This paper presents novel soft-decision decoding (SDD) algorithms for error correction codes (ECCs) that substantially improve the reliability of DNA-based data storage systems compared with conventional hard-decision decoding (HDD). We propose a simplified system model for DNA-based data storage that captures the major characteristics and the different types of errors associated with prevailing DNA synthesis and sequencing technologies. We analytically compute the error-free probability of each sequenced DNA oligonucleotide (oligo), from which the soft-decision log-likelihood ratio (LLR) of each oligo is derived. We apply the proposed SDD algorithms to the recently proposed DNA Fountain scheme. Simulation results show that SDD improves the error rate by two to three orders of magnitude over HDD, demonstrating its potential to increase the information density of DNA-based data storage systems.
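
To make the step from error-free probability to LLR concrete, the following is a minimal sketch under assumptions that are not taken from the paper. It assumes each nucleotide independently suffers a substitution, insertion, or deletion at fixed per-nucleotide rates, and it takes the LLR to be the log-odds of the oligo being error-free. The function names, the rate parameters, and the numerical values are hypothetical; the paper's actual system model and LLR expression may differ.

```python
import math


def error_free_probability(length, p_sub, p_ins, p_del):
    """Probability that an oligo of `length` nucleotides contains no error.

    Illustrative assumption (not the paper's model): every position is hit
    independently by a substitution, insertion, or deletion with the given
    per-nucleotide rates.
    """
    p_ok = 1.0 - (p_sub + p_ins + p_del)
    if not 0.0 < p_ok <= 1.0:
        raise ValueError("per-nucleotide error rates must sum to a value in [0, 1)")
    return p_ok ** length


def reliability_llr(p_error_free, eps=1e-12):
    """Log-odds that an oligo is error-free versus erroneous.

    The probability is clamped to [eps, 1 - eps] so the logarithm stays finite.
    """
    p = min(max(p_error_free, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


if __name__ == "__main__":
    # Hypothetical rates chosen only for illustration.
    p = error_free_probability(length=150, p_sub=0.004, p_ins=0.001, p_del=0.002)
    print(f"error-free probability: {p:.4f}")
    print(f"reliability LLR:        {reliability_llr(p):.4f}")
```

In this sketch a positive LLR marks an oligo that is more likely error-free than not, so a soft-decision decoder could weight it more heavily, whereas a hard-decision decoder would treat every oligo as equally trustworthy.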