EffectorO: motif-independent prediction of effectors in oomycete genomes using machine learning and lineage-specificity

Oomycete plant pathogens cause a wide variety of diseases, including late blight of potato, sudden oak death, and downy mildews. These pathogens are major contributors to losses in numerous food crops. Oomycetes secrete effector proteins that manipulate their hosts to the pathogen's advantage. Plants have in turn evolved to recognize effectors, resulting in an evolutionary cycle of defense and counter-defense in plant–microbe interactions. This selective pressure produces highly diverse effector sequences that can be difficult to identify computationally from sequence similarity alone. We developed a novel effector prediction tool, EffectorO, that uses two complementary approaches to predict effectors in oomycete pathogen genomes: (1) a machine learning-based pipeline that predicts effector probability from the biochemical properties of a protein's N-terminal amino acid sequence and (2) a lineage-specificity pipeline that finds proteins unique to one species or genus, a sign of evolutionary divergence due to adaptation to the host. We tested EffectorO on Bremia lactucae, which causes lettuce downy mildew, and Phytophthora infestans, which causes late blight of potato and tomato. In both species it predicted many novel effector candidates while recovering the majority of known effector candidates. EffectorO will be useful for discovering novel families of oomycete effectors without relying on sequence similarity to known effectors.
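
The two approaches can be illustrated with a minimal Python sketch. It is not the EffectorO implementation: the 100-residue window, the three features, the function names, and the shape of the homology-hit table are all assumptions made for illustration. Only the Kyte-Doolittle hydropathy values are the standard published scale. In a real pipeline the feature dictionary would be passed to a trained classifier, and the hit table would come from a homology search against other genomes.

```python
# Kyte-Doolittle hydropathy scale (standard published values).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def n_terminal_features(seq, window=100):
    """Average simple biochemical properties over the N-terminal window.

    `window=100` and the choice of features are illustrative assumptions.
    Non-standard residues (e.g. X, *) are skipped.
    """
    region = [aa for aa in seq.upper()[:window] if aa in KD]
    if not region:
        raise ValueError("no standard amino acids in the N-terminal window")
    n = len(region)
    return {
        "gravy": sum(KD[aa] for aa in region) / n,               # mean hydropathy
        "frac_positive": sum(aa in "KR" for aa in region) / n,   # Lys + Arg
        "frac_negative": sum(aa in "DE" for aa in region) / n,   # Asp + Glu
    }


def is_lineage_specific(protein_id, hits_by_protein, own_genus):
    """Return True if every homology hit for the protein is in its own genus.

    `hits_by_protein` is a hypothetical mapping from protein ID to the set of
    genera in which a homolog was found; a protein with no hits at all is
    treated as lineage-specific.
    """
    hit_genera = hits_by_protein.get(protein_id, set())
    return hit_genera <= {own_genus}


if __name__ == "__main__":
    hits = {"p1": {"Bremia"}, "p2": {"Bremia", "Phytophthora"}}
    print(n_terminal_features("MKRDELLAAV"))
    print(is_lineage_specific("p1", hits, "Bremia"))
    print(is_lineage_specific("p2", hits, "Bremia"))
```

Keeping the two steps as separate functions mirrors the way the approaches are described as complementary: a candidate can be flagged by either one, or by both.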
