StatEcoNet: Statistical Ecology Neural Networks for Species Distribution Modeling

This paper focuses on a core task in computational sustainability and statistical ecology: species distribution modeling (SDM). In SDM, the occurrence pattern of a species across a landscape is predicted from environmental features, based on observations at a set of locations. At first glance, SDM may appear to be a binary classification problem, and one might be inclined to tackle it with classic tools (e.g., logistic regression, support vector machines, neural networks). However, wildlife surveys introduce structured noise into the species observations, especially under-counting. If unaccounted for, these observation errors systematically bias SDMs. To address the unique challenges of SDM, this paper proposes a framework called StatEcoNet. Specifically, this work uses a graphical generative model from statistical ecology as the skeleton of the proposed computational framework and carefully integrates neural networks within it. The advantages of StatEcoNet over related approaches are demonstrated on simulated datasets as well as bird species data. Since SDMs are critical tools for ecological science and natural resource management, StatEcoNet may offer greater computational and analytical power to a wide range of applications with significant social impact, e.g., the study and conservation of threatened species.
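
To make the under-counting bias concrete, the following is a minimal sketch and not the StatEcoNet implementation. It assumes the standard single-season site-occupancy model from statistical ecology, in which each site is occupied with probability psi and, if occupied, the species is detected on each visit with probability p, with no false positives. The abstract does not specify the generative model, so this choice, the function names, and all parameter values are illustrative assumptions.

```python
import random


def simulate_naive_occupancy(n_sites=2000, n_visits=3, psi=0.6, p=0.3, seed=0):
    """Simulate repeated surveys; return (true, naive) occupancy rates.

    Hypothetical helper: psi is the occupancy probability and p the
    per-visit detection probability at an occupied site.
    """
    rng = random.Random(seed)
    occupied = 0
    detected_at_least_once = 0
    for _ in range(n_sites):
        z = rng.random() < psi  # latent true occupancy state
        occupied += z
        # The species can only be detected where it is present.
        seen = z and any(rng.random() < p for _ in range(n_visits))
        detected_at_least_once += seen
    return occupied / n_sites, detected_at_least_once / n_sites


def site_likelihood(y, psi, p):
    """Likelihood of one site's detection history y (list of 0/1).

    Marginalizes over the latent occupancy state: either the site is
    occupied and produced history y, or it is unoccupied, which is only
    possible when the species was never detected.
    """
    detect = 1.0
    for y_t in y:
        detect *= p if y_t else (1.0 - p)
    never_seen = 0.0 if any(y) else 1.0
    return psi * detect + (1.0 - psi) * never_seen


if __name__ == "__main__":
    true_rate, naive_rate = simulate_naive_occupancy()
    print(f"true occupancy rate:  {true_rate:.3f}")
    print(f"naive occupancy rate: {naive_rate:.3f}")
    print(f"L(y=[0,0,0]) = {site_likelihood([0, 0, 0], 0.6, 0.3):.4f}")
```

Treating "never detected" as "absent" can only lower the estimated occupancy rate, which is the systematic bias described above. In a framework of the kind the abstract outlines, psi and p would presumably be functions of site and survey features rather than constants; that extension is not shown here.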
