Understanding the human genome

At least since Alan Turing tackled Enigma in World War II, building machines to crack codes has been the domain of computer scientists and engineers. Lately they have joined biologists in cracking humanity's most important code: the human genome, the complete set of all our genetic information. Sequencing the human genome essentially means putting in order the more than 3 billion chemical units that encode the instructions for building and operating a human being. But those instructions are written in a language that biology does not fully understand. Indeed, some have described the genome as a parts list that lacks any information on how the parts connect or what they do. Leading scientists are quick to point out that knowing the raw data set that makes up the genome is not an end in itself. Rather, the usefulness of the genome will emerge only after scientists have figured out how the parts go about making the machine that is the human body. This is what the biology of the new millennium is all about. It is an accelerated science based as much on bits and semiconductor chips as on microscope slides and test tubes. Francis Collins, director of the National Human Genome Research Institute in Bethesda, Md., predicted that the in silico approach will eclipse in vitro and even in vivo. Scientists who suspect a genetic correlation with a disease can now seek out starting points in the genes of humans and other creatures, compressing what would have been a decade or more of research into a day or two of database queries. In fact, an industry known as bioinformatics has grown up around the idea that biology will increasingly depend on sorting and manipulating huge amounts of data. Industry analysts forecast that the market for genomics information and the technology to use it will reach US $2 billion a year by 2005.
