PAT: a protein analysis toolkit for integrated biocomputing on the web

PAT (Protein Analysis Toolkit) is an integrated biocomputing server. It was designed to make it easier to combine different processing tools for complex protein analyses and to automate repetitive tasks. The PAT server provides a standardized web interface to a wide range of protein analysis tools, forming a streamlined environment for studies of protein sequences and structures. PAT reads and writes data in many bioinformatics formats, and users can build pipelines by sending the output of one tool to the input of another. PAT retrieves protein entries from identifier-based queries by using pre-computed database indexes. Users can formulate complex queries that combine different analysis tools either with a few mouse clicks or via a dedicated macro language, and a web session manager provides direct access to any temporary file generated during the user session. PAT is freely accessible on the Internet.
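The pipeline idea described above, where the output of one tool becomes the input of the next, can be illustrated with a minimal Python sketch. This is not PAT's macro language or API, neither of which is described here; every name below (`parse_fasta`, `uppercase`, `filter_min_length`, `run_pipeline`, `to_fasta`) is hypothetical and chosen only for illustration. The sketch assumes each tool is a function that takes and returns a mapping from sequence identifier to sequence.

```python
from typing import Callable, Dict, List

# Assumption: every "tool" consumes and produces the same record type,
# which is what makes arbitrary chaining possible in this sketch.
Records = Dict[str, str]
Tool = Callable[[Records], Records]


def parse_fasta(text: str) -> Records:
    """Read FASTA-formatted text into {identifier: sequence}."""
    records: Records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Keep only the first word of the header as the identifier.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def to_fasta(records: Records) -> str:
    """Write records back out as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records.items())


def uppercase(records: Records) -> Records:
    """Hypothetical tool: normalize sequences to upper case."""
    return {name: seq.upper() for name, seq in records.items()}


def filter_min_length(min_len: int) -> Tool:
    """Hypothetical tool factory: keep sequences of at least min_len residues."""
    def tool(records: Records) -> Records:
        return {n: s for n, s in records.items() if len(s) >= min_len}
    return tool


def run_pipeline(records: Records, tools: List[Tool]) -> Records:
    """Feed the output of each tool into the next one, in order."""
    for tool in tools:
        records = tool(records)
    return records


if __name__ == "__main__":
    text = ">seq1 example\nacdefg\nhik\n>seq2\nlm\n"
    result = run_pipeline(parse_fasta(text), [uppercase, filter_min_length(5)])
    print(to_fasta(result), end="")
```

Giving every step the same input and output type is what lets steps be reordered or combined freely; a real server would additionally need format conversion between tools that do not share a native format.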
