Gaussian Process Models of Spatial Aggregation Algorithms

Multi-level spatial aggregates are important for data mining in a variety of scientific and engineering applications, from analysis of weather data (aggregating temperature and pressure data into ridges and fronts) to performance analysis of wireless systems (aggregating simulation results into configuration-space regions exhibiting particular performance characteristics). In many of these applications, data collection is expensive and time-consuming, so effort must be focused on gathering samples at the locations that matter most for the analysis. Doing so requires a functional model of the data mining algorithm, so that the impact of potential samples on the mining of suitable spatial aggregates can be assessed. This paper describes a novel Gaussian process approach to modeling multi-layer spatial aggregation algorithms and demonstrates that the resulting models capture the essential underlying qualitative behaviors of the algorithms. By helping cast classical spatial aggregation algorithms in a rigorous quantitative framework, the Gaussian process models support diverse uses such as directed sampling, characterizing the sensitivity of a mining algorithm to particular parameters, and understanding how variations in input data fields percolate up through a spatial aggregation hierarchy.
