GIS and the Random Forest Predictor: Integration in R for Tick-Borne Disease Risk Assessment

We discuss how sophisticated machine learning methods can be rapidly integrated within a GIS to develop new approaches in landscape epidemiology. A multitemporal predictive map is obtained by modeling in R, analyzing geodata and digital maps in GRASS, and managing biodata samples and weather data in PostgreSQL. In particular, we present a risk mapping system for tick-borne diseases and apply it to model the risk of exposure to Lyme borreliosis and tick-borne encephalitis (TBE) in Trentino, in the Italian Alps.
