Machine learning algorithms to infer trait‐matching and predict species interactions in ecological networks

Ecologists have long suspected that species are more likely to interact if their traits match in a particular way. For example, a pollination interaction may be more likely if the proportions of a bee's tongue fit a plant's flower shape. Empirical estimates of the importance of trait-matching for determining species interactions, however, vary substantially among different types of ecological networks. Here, we show that ambiguity among empirical trait-matching studies may have arisen at least in part from using overly simple statistical models. Using simulated and real data, we contrast conventional generalized linear models (GLMs) with more flexible machine learning (ML) models (Random Forest, Boosted Regression Trees, Deep Neural Networks, Convolutional Neural Networks, Support Vector Machines, naive Bayes, and k-Nearest-Neighbor), testing their ability to predict species interactions based on traits and to infer the trait combinations causally responsible for species interactions. We find that the best ML models can successfully predict species interactions in plant-pollinator networks, outperforming GLMs by a substantial margin. Our results also demonstrate that ML models can better identify the causally responsible trait-matching combinations than GLMs. In two case studies, the best ML models successfully predicted species interactions in a global plant-pollinator database and inferred ecologically plausible trait-matching rules for a plant-hummingbird network, without any prior assumptions. We conclude that flexible ML models offer many advantages over traditional regression models for understanding interaction networks. We anticipate that these results extrapolate to other ecological network types. More generally, our results highlight the potential of machine learning and artificial intelligence for inference in ecology, beyond standard tasks such as image or pattern recognition.
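The contrast between a simple regression model and a flexible ML model can be illustrated with a toy simulation. The sketch below is not the study's code or data: the two traits, the matching rule (interaction when the scaled traits differ by less than 0.2), the main-effects-only logistic regression standing in for a GLM, and the choice of k-Nearest-Neighbor as the flexible model are all assumptions made for illustration.

```python
import math
import random

def simulate(n, rng, width=0.2):
    """Draw n plant-pollinator pairs; a pair interacts when its traits match.

    Both traits and the matching width are hypothetical, not from the study.
    """
    data = []
    for _ in range(n):
        plant = rng.random()  # e.g. scaled corolla depth (assumed trait)
        poll = rng.random()   # e.g. scaled tongue length (assumed trait)
        data.append(((plant, poll), 1 if abs(plant - poll) < width else 0))
    return data

def fit_glm(train, epochs=300, lr=0.5):
    """Logistic regression with main effects only, fitted by gradient descent."""
    w = [0.0, 0.0, 0.0]  # intercept, plant trait, pollinator trait
    n = len(train)
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0]
        for (a, b), y in train:
            p = 1.0 / (1.0 + math.exp(-(w[0] + w[1] * a + w[2] * b)))
            for j, x in enumerate((1.0, a, b)):
                grad[j] += (p - y) * x
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    # Predict an interaction when the linear predictor is positive (p > 0.5).
    return lambda a, b: 1 if w[0] + w[1] * a + w[2] * b > 0 else 0

def fit_knn(train, k=7):
    """k-Nearest-Neighbor classifier using Euclidean distance in trait space."""
    def predict(a, b):
        nearest = sorted(
            train,
            key=lambda item: (item[0][0] - a) ** 2 + (item[0][1] - b) ** 2,
        )[:k]
        return 1 if 2 * sum(y for _, y in nearest) > k else 0
    return predict

def accuracy(model, test):
    """Fraction of test pairs whose interaction status is predicted correctly."""
    return sum(model(a, b) == y for (a, b), y in test) / len(test)

rng = random.Random(42)
train, test = simulate(500, rng), simulate(500, rng)
glm_acc = accuracy(fit_glm(train), test)
knn_acc = accuracy(fit_knn(train), test)
print(f"GLM accuracy: {glm_acc:.2f}, kNN accuracy: {knn_acc:.2f}")
```

Because the simulated interaction depends on the difference between the two traits rather than on either trait alone, a model limited to additive main effects has little signal to use, whereas a local method such as k-Nearest-Neighbor can recover the matching band. Accuracy is used here only for brevity; with imbalanced interaction data a prevalence-robust measure would be a more careful choice.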
