System architectural design of multiwavelength data mining

Astronomy faces a data avalanche: astronomical data now span the radio, infrared, optical, X-ray and even gamma-ray bands, and the field has entered an era of all-sky surveys. Transforming these data into knowledge depends on data mining techniques, so extracting knowledge from databases effectively and efficiently is an important issue. Mining knowledge from different bands, or from multiband data, is of particular significance. In this paper, we design a system that comprises four fundamental blocks: the first creates databases; the second cross-matches objects from different bands; the third mines knowledge from the large data volume; and the last evaluates the final results. The functionalities of the four blocks are described. The cross-match results are divided into groups, and the analysis mode for each group is touched upon. Moreover, the schemes for classification, regression, clustering analysis and outlier detection are demonstrated.
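To make the second block concrete, the following is a minimal sketch of a positional cross-match between two bands, not the system's actual implementation. It assumes each catalogue is a list of (source_id, ra_deg, dec_deg) tuples, uses a hypothetical 3-arcsecond matching radius, and splits the result three ways (matched in both bands, found only in the first, found only in the second); that particular split is an assumption about how the cross-match results might be divided. The brute-force loop is for illustration only; a real survey-scale system would use a spatial index.

```python
import math

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))

def cross_match(cat_a, cat_b, radius_arcsec=3.0):
    """Match each source in cat_a to its nearest cat_b source within the radius.

    Returns (matched, only_a, only_b), where matched holds
    (id_a, id_b, separation_arcsec) tuples. The radius is an assumed value.
    """
    radius_deg = radius_arcsec / 3600.0
    matched, only_a = [], []
    used_b = set()
    for id_a, ra_a, dec_a in cat_a:
        best = None  # (id_b, separation_deg) of the closest candidate so far
        for id_b, ra_b, dec_b in cat_b:
            sep = angular_separation_deg(ra_a, dec_a, ra_b, dec_b)
            if sep <= radius_deg and (best is None or sep < best[1]):
                best = (id_b, sep)
        if best is None:
            only_a.append(id_a)
        else:
            matched.append((id_a, best[0], best[1] * 3600.0))
            used_b.add(best[0])
    # Sources in cat_b that no cat_a source was matched to.
    only_b = [id_b for id_b, _, _ in cat_b if id_b not in used_b]
    return matched, only_a, only_b

# Hypothetical toy catalogues for two bands.
optical = [("opt1", 150.0000, 2.0000), ("opt2", 150.1000, 2.1000)]
infrared = [("ir1", 150.0002, 2.0001), ("ir2", 151.0000, 3.0000)]
matched, only_optical, only_infrared = cross_match(optical, infrared)
```

Here `matched` would feed multiband analyses such as classification or regression, while the unmatched lists could be examined separately, for example as candidates for outlier detection.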
