Aerosol Optical Depth Prediction from Satellite Observations by Multiple Instance Regression

Aerosols are small airborne particles that both reflect and absorb incoming solar radiation, and quantifying their effect on the Earth’s radiation budget is one of the biggest challenges of current climate research. To help address this challenge, numerous satellite sensors are employed to achieve global-scale monitoring of aerosols. Given the satellite measurements, the common objective is to predict Aerosol Optical Depth (AOD). An important property of AOD is its low spatial variability on a scale of tens of kilometers. Satellite sensors, on the other hand, gather information in the form of multi-spectral images with high spatial resolution, where pixels can be as small as a few hundred meters. Given an accurate ground-based AOD measurement at a specific location and time, all the pixels in the vicinity can be assumed to have the same AOD. If we treat the satellite measurement at a single pixel as an instance, all pixels in the neighborhood can be considered a bag of instances labeled with the same AOD. Given a number of bags obtained at numerous locations and at different times, we can treat the problem of AOD prediction from satellite attributes as Multiple Instance Regression (MIR). An important challenge is that, because of rapidly changing surface properties, the attribute values of pixels within a single bag can vary substantially. This study evaluated several MIR approaches on several synthetic data sets and on a data set consisting of 800 labeled bags, each containing hundreds of pixel instances observed over the continental U.S. by the MISR satellite instrument. The results indicate that the most successful MIR approach consists of an iterative procedure that detects and discards outlying instances and trains a predictor on the remaining ones.
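
The iterative procedure named in the last sentence can be illustrated with a minimal sketch. The abstract does not specify the predictor, the outlier criterion, or the stopping rule, so everything below is an assumption made for illustration: a linear least-squares predictor, outliers defined as the instances whose prediction is farthest from their bag label, a fixed fraction dropped per round, and a bag-level prediction taken as the median of instance predictions. The names `prune_mir`, `predict_bag`, `drop_frac`, and `min_keep` are hypothetical and do not come from the paper.

```python
import numpy as np

def _design(X):
    # Append a constant column so the linear model has an intercept.
    return np.hstack([X, np.ones((X.shape[0], 1))])

def _fit(bags, labels, keep):
    # Pool all currently kept instances; each inherits its bag's label.
    X = np.vstack([b[k] for b, k in zip(bags, keep)])
    y = np.concatenate([np.full(int(k.sum()), l) for k, l in zip(keep, labels)])
    w, *_ = np.linalg.lstsq(_design(X), y, rcond=None)
    return w

def prune_mir(bags, labels, n_iter=5, drop_frac=0.1, min_keep=3):
    """Iteratively fit a predictor and discard outlying instances.

    bags   : list of (n_i, d) arrays, one array of pixel attributes per bag
    labels : one real-valued label (e.g. AOD) per bag
    Returns the final weight vector and one boolean keep-mask per bag.
    """
    keep = [np.ones(len(b), dtype=bool) for b in bags]
    for _ in range(n_iter):
        w = _fit(bags, labels, keep)
        for b, k, l in zip(bags, keep, labels):
            idx = np.flatnonzero(k)
            # Assumed rule: drop a fixed fraction, never going below min_keep.
            n_drop = min(int(drop_frac * len(idx)), len(idx) - min_keep)
            if n_drop <= 0:
                continue
            err = np.abs(_design(b[idx]) @ w - l)
            k[idx[np.argsort(err)[-n_drop:]]] = False
    # Final predictor is trained on the remaining instances only.
    return _fit(bags, labels, keep), keep

def predict_bag(w, bag):
    # Assumed aggregation: median of the instance-level predictions.
    return float(np.median(_design(bag) @ w))
```

With bags built from pixel attributes and ground-based labels, `w, keep = prune_mir(bags, labels)` returns the fitted weights and the surviving instances per bag, and `predict_bag(w, new_bag)` gives a single estimate for an unlabeled bag. Whether pruning actually removes the right instances depends on the data and on the assumed outlier rule.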
