Assembling the SARS-CoV genome: a new method based on a graph-theoretical approach

Nowadays, scientists can learn a great deal about an organism simply by analyzing its genetic material, which requires methods for reading genomes with high accuracy. It has become clear that knowledge of the changes occurring within a viral genome is indispensable for fighting the pathogen effectively. A good example is SARS-CoV, which caused many deaths and frightened the entire world with its fast, hard-to-prevent spread. The rapid development of sequencing methods, such as shotgun sequencing and sequencing by hybridization (SBH), gives scientists a good tool for reading genomes. However, because sequencing methods can read fragments of up to 1000 bp only, sequence assembly methods are required to reconstruct whole genomes. This paper presents a new assembly method based on a graph-theoretical approach. The method was tested on the SARS-CoV genome, and the results were compared with those of other widely known methods.
