Prediction of bacterial associations with plants using a supervised machine-learning approach.

Recent cases of fresh-produce contamination by human enteric pathogens have caused severe food-borne outbreaks, and a new paradigm has emerged in which some human-associated bacteria can use plants as secondary hosts. As a consequence, there is growing concern in the scientific community about these interactions, which have not yet been elucidated. Because this is a relatively new area, strategies are lacking to address food-borne illness caused by the ingestion of fruits and vegetables. In the present study, we performed specific genome annotations to train a supervised machine-learning model that identifies plant-associated bacteria with a precision of ∼93%. Applying our method to approximately 9500 genomes predicted several previously unknown interactions between well-known human pathogens and plants, and it also confirmed several cases for which evidence has already been reported. The most predictive features were factors involved in adhesion, deconstruction of the plant cell wall, and detoxifying activities. Applied to sequenced strains involved in food poisoning, our strategy can serve as a primary screening tool to determine the possible causes of contamination.
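
To make the reported metric concrete: precision is the fraction of genomes predicted to be plant-associated that truly are plant-associated, TP / (TP + FP). The following is a minimal sketch of that calculation only. The function name `precision`, the label strings and the toy label lists are hypothetical illustrations, not data or code from the study.

```python
def precision(y_true, y_pred, positive="plant-associated"):
    """Return TP / (TP + FP) for the given positive class.

    y_true and y_pred are equal-length sequences of class labels.
    Returns 0.0 when nothing was predicted as positive.
    """
    pairs = list(zip(y_true, y_pred))
    # True positives: predicted positive and actually positive.
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    # False positives: predicted positive but actually another class.
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    if tp + fp == 0:
        return 0.0
    return tp / (tp + fp)


# Hypothetical toy labels for five genomes (not taken from the study).
y_true = ["plant-associated", "plant-associated", "other", "other", "plant-associated"]
y_pred = ["plant-associated", "plant-associated", "plant-associated", "other", "other"]

if __name__ == "__main__":
    print(f"precision = {precision(y_true, y_pred):.3f}")
```

In this toy case, two of the three genomes predicted as plant-associated are correct, so the precision is 2/3. Precision does not account for plant-associated genomes that the model misses; that is measured by recall.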
