Blocks‐based methods for detecting protein homology

The most highly conserved regions of proteins can be represented as blocks of aligned sequence segments, typically with multiple blocks for a given protein family. The Blocks Database World Wide Web (http://blocks.fhcrc.org) and e‐mail (blocks@blocks.fhcrc.org) servers provide tools to search DNA and protein queries against the Blocks+ Database of multiple alignments. We describe features for detection of distant relationships using blocks. Blocks+ includes protein families from the PROSITE, Prints, Pfam‐A, ProDom and Domo databases. Other features include searching Blocks+ with the BLIMPS and NCBI's IMPALA programs, sequence logos, phylogenetic trees, three‐dimensional display of blocks on PDB structures, and a polymerase chain reaction (PCR) primer design strategy based on blocks.
