Protein segment finder: an online search engine for segment motifs in the PDB

Finding related conformations in the Protein Data Bank (PDB) is essential in many areas of bioscience. To assist with this task, we designed a search engine that uses a compact database to quickly identify protein segments satisfying a set of primary, secondary and tertiary structure constraints. The database contains information such as amino acid sequence, secondary structure, disulfide bonds, hydrogen bonds and atoms in contact, as calculated from all protein structures in the PDB. The search engine parses the database and returns hits that match the queried parameters. The conformation search engine, which is notable for its high speed and interactive feedback, is expected to assist scientists in discovering conformational homologs and predicting protein structure. The engine is publicly available at http://ari.stanford.edu/psf. It will also be used in-house in an automatic mode aimed at discovering new protein motifs.
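The kind of constraint-based segment query described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the engine's actual implementation or database format: the `Chain` record, the `find_segments` function, the use of regular expressions as the query language, the DSSP-style secondary structure letters, and the toy chain (which is not a real PDB entry). Only primary and secondary structure constraints are shown; tertiary constraints such as disulfide bonds, hydrogen bonds and atomic contacts are left out.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Chain:
    """Hypothetical compact record for one protein chain."""
    pdb_id: str
    chain_id: str
    sequence: str   # one-letter amino acid codes
    secondary: str  # one secondary structure letter per residue


def find_segments(chains, length, seq_pattern, ss_pattern):
    """Return (pdb_id, chain_id, start, segment) for every window of
    `length` residues whose sequence and secondary structure both
    match the given regular expressions in full."""
    seq_re = re.compile(seq_pattern)
    ss_re = re.compile(ss_pattern)
    hits = []
    for chain in chains:
        # Slide a fixed-length window along the chain.
        for start in range(len(chain.sequence) - length + 1):
            end = start + length
            seq_window = chain.sequence[start:end]
            ss_window = chain.secondary[start:end]
            if seq_re.fullmatch(seq_window) and ss_re.fullmatch(ss_window):
                hits.append((chain.pdb_id, chain.chain_id, start, seq_window))
    return hits


# Toy data, invented for this example only.
TOY_CHAINS = [
    Chain(pdb_id="toy1", chain_id="A",
          sequence="ACDECGHCK",
          secondary="CHHHHHHCC"),
]

# Query: four residues, cysteine at both ends, all four in a helix ("H").
hits = find_segments(TOY_CHAINS, 4, r"C..C", r"H{4}")
```

In this toy chain, two windows satisfy the sequence constraint, but only the one starting at index 1 also lies entirely in the helical region, so it is the only hit. A production engine would presumably rely on indexing rather than a linear scan to reach interactive speeds; the sketch makes no attempt at that.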
