Computer Manipulation of DNA and Protein Sequences

This unit outlines a variety of methods by which DNA sequences can be manipulated by computer. Procedures for entering sequence data into the computer and assembling raw sequence data into a contiguous sequence are described first, followed by methods for analyzing and manipulating sequences, e.g., verifying sequences, constructing restriction maps, designing oligonucleotides, identifying protein-coding regions, and predicting secondary structures. This unit also provides information on the large amount of software available for sequence analysis. The appendix to this unit lists some of the commercial software, shareware, and free software related to DNA sequence manipulation. The goal of this unit is to serve as a starting point for researchers interested in utilizing the tremendous sequencing resources available to the computer-knowledgeable molecular biology laboratory.
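As a small illustration of one of the manipulations named above, constructing a restriction map, the following Python sketch locates recognition sites in a linear DNA sequence and reports the resulting fragment lengths. It is not taken from the unit or from any of the software packages it lists; the function names (`reverse_complement`, `find_sites`, `fragment_lengths`) and the example sequence are hypothetical, and the sketch assumes a linear molecule, an unambiguous A/C/G/T alphabet, and a palindromic recognition site such as that of EcoRI (GAATTC, cut after the first G).

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_sites(seq, site):
    """Return 0-based start positions of every occurrence of `site` in `seq`.

    Overlapping occurrences are reported. Only the given strand is scanned,
    which is sufficient when the site is palindromic.
    """
    seq, site = seq.upper(), site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def fragment_lengths(seq_length, positions, cut_offset):
    """Return fragment lengths after cutting a linear molecule.

    `cut_offset` is the number of bases from the start of the recognition
    site to the cut on the scanned strand (1 for EcoRI, G^AATTC).
    """
    cuts = [p + cut_offset for p in positions]
    bounds = [0] + cuts + [seq_length]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    example = "AAGAATTCGGGGAATTCTT"  # hypothetical test sequence
    ecori = "GAATTC"
    sites = find_sites(example, ecori)
    print("EcoRI sites at:", sites)
    print("Fragment lengths:", fragment_lengths(len(example), sites, 1))
```

A non-palindromic site would also need to be searched as its reverse complement (for example `find_sites(seq, reverse_complement(site))`), with the cut offset adjusted for the opposite strand.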
