Content-based search and clustering of remote sensing imagery

The increasing amount of imagery to be collected by the Earth Observing System Data Information System (EOSDIS) emphasizes the need for intelligent retrieval infrastructures that enable model fitting and hypothesis testing on a very large scale, rather than on a small subset of the available data. The authors' work addresses two strategic challenges: data streaming and organization. The first involves the reduction of raw multispectral image data into attribute terms, which quantify a number of parameters of scientific interest and their temporal evolution at different spatial scales. To this end, the authors are developing feature extraction algorithms that hold great promise for the automatic categorization of large collections of images. The second challenge is to provide the technology that can organize the extracted features and turn them into information. The solution requires the customization and embedding of sophisticated statistical algorithms in a database management system. Examples are agglomerative or divisive clustering of database attributes, multivariate techniques for inspecting the dependence of any attribute on two or more other attributes, and classification and regression trees for discovering hidden relationships among attributes. The emerging field of knowledge discovery in databases (KDD) has spawned a renaissance in statistical computing and has much to offer the remote sensing community if it is to make the best use of the volume of new measurements.
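As an illustration of the agglomerative clustering mentioned above, the following is a minimal sketch, not the authors' implementation. It assumes each image has already been reduced to a small numeric attribute vector. The function name `agglomerative_clusters`, the choice of single linkage, and the sample attribute values are all hypothetical and chosen only for the example.

```python
from itertools import combinations
from math import dist


def agglomerative_clusters(points, n_clusters):
    """Merge the two closest clusters (single linkage) until n_clusters remain."""
    # Start with every attribute vector in its own cluster (stored as index lists).
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        # Merge cluster b into cluster a, then drop b (b > a, so index a stays valid).
        clusters[a].extend(clusters[b])
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical per-image attribute vectors (assumed for illustration only),
# e.g. (mean vegetation index, cloud fraction).
features = [(0.82, 0.05), (0.79, 0.08), (0.15, 0.60), (0.18, 0.65), (0.80, 0.07)]
groups = agglomerative_clusters(features, n_clusters=2)
print(groups)  # lists of image indices, one list per cluster
```

This brute-force version recomputes every pairwise linkage on each merge, so it is suitable only for small examples. Embedding such a procedure in a database management system, as the abstract proposes, would call for a more scalable formulation.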