The Pfam protein families database

Pfam is a large collection of protein multiple sequence alignments and profile hidden Markov models. Pfam is available on the web in the UK at http://www.sanger.ac.uk/Software/Pfam/, in Sweden at http://www.cgr.ki.se/Pfam/, and in the US at http://pfam.wustl.edu/. The latest version (4.3) of Pfam contains 1815 families, which match 63% of proteins in SWISS-PROT 37 and TrEMBL 9. For complete genomes, Pfam currently matches up to half of the proteins. Genomic DNA can be searched directly against the Pfam library using the Wise2 package.
