TI2BioP: Topological Indices to BioPolymers. Its practical use to unravel cryptic bacteriocin-like domains

Bacteriocins are proteinaceous toxins produced and exported by both Gram-negative and Gram-positive bacteria as a defense mechanism. The bacteriocin protein family is highly diverse, which complicates the identification of bacteriocin-like sequences by alignment-based approaches. Topological indices (TIs), which do not depend on sequence similarity, are a promising alternative for predicting proteinaceous bacteriocins. We therefore present Topological Indices to BioPolymers (TI2BioP), an alignment-free approach inspired by both the Topological Substructural Molecular Design (TOPS-MODE) and the Markov Chain Invariants for Network Selection and Design (MARCH-INSIDE) methodologies. TI2BioP calculates spectral moments as simple TIs for building quantitative sequence-function relationship (QSFR) models. Since hydrophobicity and basicity are major criteria for the bactericidal activity of bacteriocins, the spectral moments (HPμk) were derived, for the first time, from artificial protein secondary structures obtained by clustering amino acids in a Cartesian system of hydrophobicity and polarity. Several orders of HPμk were used to characterize numerically 196 bacteriocin-like sequences and a control group of 200 representative CATH domains. These descriptors were then used to develop an alignment-free QSFR model that discriminated bacteriocin proteins from other domains with 76.92% accuracy, a relevant result given the high sequence diversity within both groups. The model showed an overall prediction performance of 72.16% and specifically detected 66.7% of proteinaceous bacteriocins, whereas InterProScan retrieved just 60.2%. As a practical validation, the model also successfully predicted the cryptic bactericidal function of the Cry1Ab C-terminal domain from the Bacillus thuringiensis endotoxin, which had not been detected by classical alignment methods.
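The idea of deriving spectral moments from a Cartesian pseudo-folding of the sequence can be illustrated with a minimal Python sketch. It is not the TI2BioP implementation, and several choices are assumptions made only for illustration: the four-class residue grouping (`RESIDUE_CLASS`), the direction assigned to each class (`STEP`), and the use of an unweighted node adjacency matrix, whereas the TOPS-MODE family of methods works with weighted edge (bond) adjacency matrices. Under these assumptions, each residue moves one unit step on a 2D lattice, residues landing on the same coordinate collapse into one node, and the moment of order k is the trace of the k-th power of the adjacency matrix.

```python
# Illustrative sketch only: the residue grouping and step directions below
# are hypothetical choices, not the ones defined by TI2BioP.
RESIDUE_CLASS = {}
for _cls, _letters in {
    "nonpolar": "AVLIMFWPGC",
    "polar": "STNQY",
    "acidic": "DE",
    "basic": "KRH",
}.items():
    for _aa in _letters:
        RESIDUE_CLASS[_aa] = _cls

# One unit step per residue class on the 2D lattice (assumed directions).
STEP = {"nonpolar": (1, 0), "polar": (-1, 0), "acidic": (0, -1), "basic": (0, 1)}


def lattice_walk(sequence):
    """Return the lattice coordinate reached after each residue."""
    x, y = 0, 0
    coords = []
    for aa in sequence.upper():
        if aa not in RESIDUE_CLASS:
            raise ValueError(f"unsupported residue: {aa!r}")
        dx, dy = STEP[RESIDUE_CLASS[aa]]
        x, y = x + dx, y + dy
        coords.append((x, y))
    return coords


def adjacency_matrix(coords):
    """Build an unweighted node adjacency matrix from consecutive walk steps.

    Residues that land on the same coordinate share one node, which is what
    clusters residues in the pseudo-folded graph.
    """
    index = {}
    for c in coords:
        index.setdefault(c, len(index))
    n = len(index)
    matrix = [[0] * n for _ in range(n)]
    for a, b in zip(coords, coords[1:]):
        i, j = index[a], index[b]
        matrix[i][j] = matrix[j][i] = 1
    return matrix


def spectral_moments(sequence, max_order):
    """Return [mu_1, ..., mu_max_order], where mu_k = trace(A**k)."""
    a = adjacency_matrix(lattice_walk(sequence))
    n = len(a)
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # identity
    moments = []
    for _ in range(max_order):
        # Multiply the running power by A to get the next power of A.
        power = [
            [sum(power[i][k] * a[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
        moments.append(sum(power[i][i] for i in range(n)))
    return moments
```

For example, `spectral_moments("AAA", 4)` describes a three-node path graph, and `spectral_moments("ASA", 2)` shows two residues collapsing onto the same node. In a QSFR setting, such moment vectors would serve as the numerical descriptors fed to a classifier; a real application would weight the matrix with residue properties such as hydrophobicity.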
