Multi-level Multi-task Learning for Modeling Cross-Scale Interactions in Nested Geospatial Data

Predictive modeling of nested geospatial data is challenging because the models must account for potential interactions among variables defined at different spatial scales. These cross-scale interactions, as they are commonly known, are particularly important for understanding relationships among ecological properties at macroscales. In this paper, we present a novel multi-level multi-task learning framework for modeling nested geospatial data in the lake ecology domain. Specifically, we consider region-specific models that predict lake water quality from multi-scaled factors. Our framework enables a distinct model to be developed for each region using both its local and regional information. The framework also allows information to be shared among the region-specific models through a common set of latent factors. Such information sharing helps to create more robust models, especially for regions with limited or no training data. In addition, the framework can automatically determine cross-scale interactions between the regional variables and the local variables nested within them. Our experimental results show that the proposed framework outperforms all the baseline methods in at least 64% of the regions for 3 out of the 4 lake water quality datasets evaluated in this study. Furthermore, the latent factors can be clustered to obtain a new set of regions that is more closely aligned with the response variables than the original regions, which were defined a priori from the ecology domain.
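To make the notion of a cross-scale interaction concrete, the following is a minimal sketch and not the framework proposed in the paper. It assumes a simple bilinear form in which the coefficient vector of region r is w_r = B z_r, where z_r holds the regional variables and B is shared by all regions; plain least squares is swapped in for the multi-task learning procedure. All names (`interaction_features`, `fit_cross_scale`, `B`, `Z`) and the synthetic data are hypothetical.

```python
import numpy as np

def interaction_features(X_local, z_region):
    """Products of every local predictor with every regional predictor.

    X_local: (n, d) local variables for the lakes in one region.
    z_region: (k,) regional variables for that region.
    Returns an (n, d*k) matrix, ordered to match B.reshape(-1).
    """
    n = X_local.shape[0]
    return np.einsum("nd,k->ndk", X_local, z_region).reshape(n, -1)

def fit_cross_scale(X_by_region, y_by_region, Z):
    """Least-squares estimate of B (d x k) in y = x^T (B z_r), pooling regions."""
    Phi = np.vstack([interaction_features(X, z) for X, z in zip(X_by_region, Z)])
    y = np.concatenate(y_by_region)
    b, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    d, k = X_by_region[0].shape[1], Z.shape[1]
    return b.reshape(d, k)

# Synthetic, noise-free data (assumption: purely illustrative, not lake data).
rng = np.random.default_rng(0)
n_regions, n_lakes, d, k = 6, 20, 3, 2
B_true = rng.normal(size=(d, k))
Z = rng.normal(size=(n_regions, k))            # regional variables
X_all = [rng.normal(size=(n_lakes, d)) for _ in range(n_regions)]
y_all = [X @ (B_true @ z) for X, z in zip(X_all, Z)]

# Train on the first five regions; the last region has no training data.
B_hat = fit_cross_scale(X_all[:-1], y_all[:-1], Z[:-1])

# Its model is still available, because it is derived from regional variables.
w_new = B_hat @ Z[-1]
```

In this toy setting, entry (j, m) of `B_hat` measures how strongly regional variable m modulates the effect of local variable j, which is one simple reading of a cross-scale interaction. Because B is shared, a region without training lakes still receives a coefficient vector from its regional variables alone.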
