An account of in silico identification tools of secreted effector proteins in bacteria and future challenges

Abstract Bacterial pathogens secrete numerous effector proteins via six secretion systems (type I to type VI) to adapt to new environments or to promote virulence through bacterium‐host interactions. Many computational approaches have been used to identify candidate effector proteins ahead of experimental verification, because they spare laborious biological procedures and are genome-scale, automated and highly efficient. Prevalent examples include machine learning methods and statistical techniques. In this article, we summarize computational progress in predicting secreted effector proteins in bacteria, opening with an introduction to the features used to discriminate effectors from non‐effectors. We present the mechanisms, contributions and deficiencies of previously developed detection tools and benchmark them on a curated testing data set. Based on the benchmarking results, we discuss potential improvements in prediction performance, which include (1) more informative features for discriminating effectors from non‐effectors; (2) the construction of comprehensive training data sets for the machine learning algorithms; (3) the advancement of reliable prediction methods and (4) a better interpretation of the mechanisms behind the molecular processes. The future of in silico identification of bacterial secreted effectors holds both opportunities and challenges.
