A maximum entropy approach to species distribution modeling

We study the problem of modeling the geographic distributions of species, a critical task in conservation biology. We propose maximum-entropy (maxent) techniques for this problem, specifically sequential-update algorithms that can handle a very large number of features. We describe experiments comparing maxent with GARP, a standard distribution-modeling tool, on a dataset of observation data for North American breeding birds. We also study how maxent's performance varies with the number of training examples and with training time, analyze the use of regularization to avoid overfitting when the number of examples is small, and explore the interpretability of models constructed using maxent.
