Edit metric decoding: Representation strikes back

Quaternary error-correcting codes defined over the edit metric may be used as labels to track the origin of sequence data. Such applications typically impose additional, biologically motivated restrictions, such as a required GC content or the avoidance of certain patterns. As a result, these codes cannot be expected to have a regular structure, which makes decoding particularly challenging. Previous work on decoding edit codes used side effect machines, successfully decoding up to 93.86% of error vectors. In this study, the recentering/restarting algorithm is used in combination with side effect machines and an alternative representation based upon transpositions. On the same data as in the previous work, the rate of successful decoding improved significantly, in many cases coming very close to 100%.
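For readers unfamiliar with the setting, the following minimal sketch illustrates the edit metric, a GC-content check, and brute-force nearest-codeword decoding. It is not the side effect machine decoder studied in this work; the function names and the two-word toy code are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions that turn a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def gc_content(word: str) -> float:
    """Fraction of symbols in word that are G or C."""
    return sum(ch in "GC" for ch in word) / len(word)


def nearest_codeword(received: str, code: Sequence[str]) -> str:
    """Brute-force decoding: return the codeword closest to the received
    word in the edit metric (ties go to the earliest codeword)."""
    return min(code, key=lambda c: edit_distance(received, c))


# Hypothetical toy code over {A, C, G, T}; not taken from the paper.
toy_code = ["ACGTAC", "TGCATG"]
decoded = nearest_codeword("ACGAC", toy_code)  # received word with one deletion
```

Brute-force decoding compares the received word against every codeword, so its cost grows with the size of the code. That cost, together with the lack of regular structure noted above, is one reason faster decoders are of interest for such codes.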
