A Software for Prediction of Adhesins and Adhesin-like proteins using Neural networks

Motivation: The adhesion of microbial pathogens to host cells is mediated by adhesins. Experimental methods for characterizing adhesins are time consuming and demand large resources. Specialized software can help experimenters simplify this problem rapidly. We employed 105 compositional properties and artificial neural networks (ANN) to develop SPAAN, which predicts the probability of a protein being an adhesin (Pad).

Results: SPAAN achieved an optimal sensitivity of 89% and specificity of 100% on a defined test set and identified 97.4% of known adhesins from a wide range of bacteria at high Pad values. Further, SPAAN guided the improved annotation of several proteins as adhesins. Novel adhesins were identified in 17 pathogenic organisms causing diseases in humans and plants. In the Severe Acute Respiratory Syndrome (SARS) associated human coronavirus, the spike glycoprotein and the nsps (nsp1, nsp5, nsp6 and nsp7) were identified as having adhesin-like characteristics. These results offer new leads for rapid experimental testing.

Availability: SPAAN is freely available by FTP from either 203.195.151.45 or 203.90.127.75 (retrieve SPAAN.tar.gz).

Contact: ramu@igib.res.in; ramucbt@yahoo.com

Downloaded from http://bioinformatics.oxfordjournals.org/ at Pennsylvania State University on February 1, 2013

Introduction

Microbial pathogens encode adhesins that mediate their adherence to host cell surface receptors, membranes, or extracellular matrix for successful colonization. Investigations of this primary event in host-pathogen interaction have revealed a wide array of adhesins in a variety of pathogenic microbes (Finlay and Falkow, 1997). New approaches to vaccine development focus on targeting adhesins to abrogate the colonization process (Wizemann et al., 1999). However, the specific roles of particular adhesins in several pathogens remain to be elucidated. One of the best-understood mechanisms of bacterial adherence is attachment mediated by pili or fimbriae.
The well-studied adhesins in this category are the FimH and PapG adhesins of Escherichia coli (Hahn et al., 2002) and the type IV pili adhesins in Pseudomonas aeruginosa, Neisseria, Moraxella, enteropathogenic Escherichia coli and Vibrio cholerae (Strom and Lory, 1993). Adhesins from other commonly known bacterial pathogens include the MrkD protein of Klebsiella pneumoniae (Gerlach et al., 1989), Hia of H. influenzae (Barenkamp and St Geme, 1996), and many others (see http://www.igib.res.in/data/seepath/spaan_data.html for details). Several vaccine formulations, either currently approved or under evaluation, use adhesins as immunizing agents. Examples include filamentous hemagglutinin and pertactin proteins against B. pertussis (Halperin et al., 2003), FimH against pathogenic E. coli (Langermann et al., 2000), PsaA against pneumococcal disease (Rapola et al., 2003), outer membrane vesicle preparations including BabA adhesin against H. pylori infections (Prinz et al., 2003) and a synthetic peptide anti-adhesin vaccine against Pseudomonas aeruginosa infections (Cachia and Hodges, 2003). Experimental identification of adhesins is an arduous task. Computational methods such as homology search can help, but they are limited when the homologues themselves are not characterized. Sequence analysis based on compositional properties offers a way around this problem. Amino acid composition is a fundamental attribute of a protein and correlates significantly with its location, function, folding type, shape and in vivo stability (Nakashima and Nishikawa, 1994; Nandi et al., 2003).
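To make the idea of compositional properties concrete, the following is a minimal Python sketch of two generic sequence-derived features: single-residue frequencies and dipeptide frequencies. It is an illustration only. The 105 properties used by SPAAN are not enumerated in this section, and the function names, the choice of these two feature types, and the handling of non-standard characters are assumptions made for the example.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in `sequence`.

    Hypothetical helper for illustration; not SPAAN's own code.
    Characters outside the 20 standard residues are ignored.
    """
    residues = [ch for ch in sequence.upper() if ch in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    counts = Counter(residues)
    return {aa: counts[aa] / len(residues) for aa in AMINO_ACIDS}


def dipeptide_composition(sequence):
    """Return the fraction of each of the 400 possible adjacent residue pairs.

    Assumption: non-standard characters are dropped before pairing, so the
    residues on either side of such a character are treated as adjacent.
    """
    residues = [ch for ch in sequence.upper() if ch in AMINO_ACIDS]
    pairs = Counter(a + b for a, b in zip(residues, residues[1:]))
    total = sum(pairs.values())
    if total == 0:
        raise ValueError("sequence is too short to contain a dipeptide")
    return {
        a + b: pairs[a + b] / total
        for a, b in product(AMINO_ACIDS, repeat=2)
    }
```

Calling `amino_acid_composition` on a protein sequence yields a fixed-length vector of 20 fractions regardless of sequence length, which is what makes features of this kind usable as inputs to a classifier without any alignment to known homologues.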
Recently, compositional properties have been applied to problems as diverse as the prediction of functional roles (Hobohm and Sander, 1995), protein secondary structures (Rost and Sander, 1993), and secretory proteins and apicoplast-targeted proteins in Plasmodium falciparum (Schneider, 1999; Zuegge et al., 2001). We report a non-homology method that uses 105 compositional properties combined with artificial neural networks (ANN) to identify adhesins and adhesin-like proteins in species spanning a wide phylogenetic spectrum.

Systems and Methods

[1] K. Nishikawa, et al. Discrimination of intracellular and extracellular proteins using amino acid composition and residue-pair frequencies. Journal of Molecular Biology, 1994.

[2] Theresa M. Wizemann, et al. Adhesins as targets for vaccine development. Emerging Infectious Diseases, 1999.

[3] J. Thompson, et al. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research, 1994.

[4] R. Hodges, et al. Synthetic peptide vaccine and antibody therapeutic development: Prevention and treatment of Pseudomonas aeruginosa. Biopolymers, 2003.

[5] B. Matthews. Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochimica et Biophysica Acta, 1975.

[6] S. Clegg, et al. Identification and characterization of the genes encoding the type 3 and type 1 fimbrial adhesins of Klebsiella pneumoniae. Journal of Bacteriology, 1989.

[7] William R. Jacobs, et al. Evidence that Mycobacterial PE_PGRS Proteins Are Cell Surface Constituents That Influence Interactions with Other Cells. Infection and Immunity, 2001.

[8] S. Lory, et al. Structure-function and biogenesis of the type IV pili. Annual Review of Microbiology, 1993.

[9] T. A. Andrea, et al. Applications of neural networks in quantitative structure-activity relationships of dihydrofolate reductase inhibitors. Journal of Medicinal Chemistry, 1991.

[10] M. Brennan, et al. Pertussis antigens that abrogate bacterial adherence and elicit immunity. American Journal of Respiratory and Critical Care Medicine, 1996.

[11] Mervi Eerola, et al. Anti-PsaA and the risk of pneumococcal AOM and carriage. Vaccine, 2003.

[12] B. Rost, et al. Improved prediction of protein secondary structure by use of sequence profiles and neural networks. Proceedings of the National Academy of Sciences of the United States of America, 1993.

[13] S. Karlin, et al. Methods and algorithms for statistical analysis of protein sequences. Proceedings of the National Academy of Sciences of the United States of America, 1992.

[14] G. Schneider, et al. How many potentially secreted proteins are contained in a bacterial genome? Gene, 1999.

[15] Debasis Dash, et al. A Novel Complexity Measure for Comparative Analysis of Protein Sequences from Complete Genomes. Journal of Biomolecular Structure & Dynamics, 2003.

[16] B. Finlay, et al. A pathogenic bacterium triggers epithelial signals to form a functional bacterial receptor that mediates actin pseudopod formation. The EMBO Journal, 1996.

[17] R. Möllby, et al. Vaccination with FimH adhesin protects cynomolgus monkeys from colonization and infection by uropathogenic Escherichia coli. The Journal of Infectious Diseases, 2000.

[18] Bruce Smith, et al. Nature, evolution, and appraisal of adverse events and antibody response associated with the fifth consecutive dose of a five-component acellular pertussis-based combination vaccine. Vaccine, 2003.

[19] M. Buchmeier, et al. Coronavirus Spike Proteins in Viral Entry and Pathogenesis. Virology, 2001.

[20] B. Berger, et al. betawrap: Successful prediction of parallel β-helices from primary sequence reveals an association with many microbial pathogens. Proceedings of the National Academy of Sciences of the United States of America, 2001.

[21] U. Hobohm, et al. A sequence property approach to searching protein databases. Journal of Molecular Biology, 1995.

[22] M. Breimer, et al. Specificity of binding of a strain of uropathogenic Escherichia coli to Galα1→4Gal-containing glycosphingolipids. The Journal of Biological Chemistry, 1985.

[23] E. Myers, et al. Basic local alignment search tool. Journal of Molecular Biology, 1990.

[24] L. Zhang, et al. Discovery of disseminated J96-like strains of uropathogenic Escherichia coli O4:H5 containing genes for both PapG(J96) (class I) and PrsG(J96) (class III) Gal(alpha1-4)Gal-binding adhesins. The Journal of Infectious Diseases, 1997.

[25] Christian Prinz, et al. Helicobacter pylori virulence factors and the host immune response: implications for therapeutic vaccination. Trends in Microbiology, 2003.

[26] J. Zuegge, et al. Deciphering apicoplast targeting signals: feature extraction from nuclear-encoded precursors of Plasmodium falciparum apicoplast proteins. Gene, 2001.

[27] Ueli Aebi, et al. Exploring the 3D molecular architecture of Escherichia coli type 1 pili. Journal of Molecular Biology, 2002.

[28] J. W. Geme, et al. Progress towards a vaccine for nontypable Haemophilus influenzae. 1996.

[29] J. W. Geme, et al. Identification of a second family of high-molecular-weight adhesion proteins expressed by non-typable Haemophilus influenzae. Molecular Microbiology, 1996.

[30] S. Falkow, et al. Common Themes in Microbial Pathogenicity Revisited. 1997.

[31] Benjamin A. Shoemaker, et al. CDD: a database of conserved domain alignments with links to domain three-dimensional structure. Nucleic Acids Research, 2002.

[32] F. Emmrich, et al. Outer membrane protein YadA of enteropathogenic yersiniae mediates specific binding to cellular but not plasma fibronectin. Infection and Immunity, 1993.