ADaM: a data mining toolkit for scientists and engineers

Algorithm Development and Mining (ADaM) is a data mining toolkit designed for use with scientific data. It provides the classification, clustering and association rule mining methods that are common to many data mining systems. In addition, it provides feature reduction, image processing, data cleaning and preprocessing capabilities that are of value when mining scientific data. The toolkit is packaged as a suite of independent components, which are designed to work in grid and cluster environments. The toolkit is extensible and scalable, and has been used successfully in several diverse data mining applications. ADaM has also been used in conjunction with other data mining toolkits and with point tools. This paper presents the architecture and design of the ADaM toolkit and discusses its application to detecting cumulus cloud fields in satellite imagery.
