Dealing with Data Corruption in Remote Sensing

Remote sensing has produced data repositories that grow much faster than they can be analyzed. One obstacle in working with remotely sensed data, as with other kinds of data, is its variable quality. Instrument failures can cause entire observation cycles to be missing, while cloud cover frequently causes missing or distorted values. We investigated several methods that deal with corrupted data automatically: robust measures, which avoid overfitting; filtering, which discards the corrupted instances; and polishing, in which the corrupted elements are replaced with more appropriate values. We applied these methods to a data set of vegetation indices and land cover types assembled from NASA's Moderate Resolution Imaging Spectroradiometer (MODIS) data collection.
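The difference between filtering and polishing can be illustrated with a minimal sketch on a toy vegetation-index series. Everything here is an assumption made for illustration: the sample values, the function names, the median-based outlier rule used to flag corrupted entries, and the use of the median of the unflagged values as the replacement. The abstract does not specify how corrupted elements are detected or corrected, so this should not be read as the authors' procedure.

```python
from statistics import median


def flag_corrupted(values, threshold=3.5):
    """Flag values whose modified z-score (based on median and MAD) exceeds threshold.

    The detection rule is a stand-in chosen for illustration only.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread around the median: nothing can be flagged by this rule.
        return [False] * len(values)
    return [0.6745 * abs(v - med) / mad > threshold for v in values]


def filter_series(values, flags):
    """Filtering: discard the instances flagged as corrupted."""
    return [v for v, bad in zip(values, flags) if not bad]


def polish_series(values, flags):
    """Polishing: keep every instance, replacing flagged values.

    The replacement here is the median of the unflagged values (an assumption).
    """
    replacement = median(filter_series(values, flags))
    return [replacement if bad else v for v, bad in zip(values, flags)]


# Hypothetical vegetation-index samples; the fourth value mimics a cloud-contaminated reading.
ndvi = [0.61, 0.63, 0.60, 0.05, 0.62, 0.64, 0.59]

flags = flag_corrupted(ndvi)
filtered = filter_series(ndvi, flags)   # shorter series, corrupted entry dropped
polished = polish_series(ndvi, flags)   # same length, corrupted entry replaced
```

Filtering shortens the series, which can matter when every observation cycle is expected to be present, whereas polishing preserves the length at the cost of introducing an estimated value. The median used in both steps is itself a simple robust measure, since a single extreme reading barely shifts it.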
