Robust Machine Learning Applied to Astronomical Data Sets. II. Quantifying Photometric Redshifts for Quasars Using Instance-based Learning

We apply instance-based machine learning in the form of a k-nearest neighbor algorithm to the task of estimating photometric redshifts for 55,746 objects spectroscopically classified as quasars in the Fifth Data Release of the Sloan Digital Sky Survey. We compare the results obtained to those from an empirical color-redshift relation (CZR). In contrast to previously published results using CZRs, we find that the instance-based photometric redshifts are assigned with no regions of catastrophic failure. Remaining outliers are simply scattered about the ideal relation, in a manner similar to the pattern seen in the optical for normal galaxies at redshifts z ≲ 1. The instance-based algorithm is trained on a representative sample of the data and pseudo-blind-tested on the remaining unseen data. The variance between the photometric and spectroscopic redshifts is σ² = 0.123 ± 0.002 (compared to σ² = 0.265 ± 0.006 for the CZR), and 54.9% ± 0.7%, 73.3% ± 0.6%, and 80.7% ± 0.3% of the objects are within Δz < 0.1, 0.2, and 0.3, respectively. We also match our sample to the Second Data Release of the Galaxy Evolution Explorer legacy data, and the resulting 7642 objects show a further improvement, giving a variance of σ² = 0.054 ± 0.005, with 70.8% ± 1.2%, 85.8% ± 1.0%, and 90.8% ± 0.7% of objects within Δz < 0.1, 0.2, and 0.3. We show that the improvement is indeed due to the extra information provided by GALEX, by training on the same data set using purely SDSS photometry, which has a variance of σ² = 0.090 ± 0.007. Each set of results represents a realistic standard for application to further data sets for which the spectra are representative.
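
As a rough illustration of the approach summarized above, the following Python sketch estimates a redshift for each object as the unweighted mean spectroscopic redshift of its k nearest training objects in color space, then reports the variance and the fractions of objects within Δz < 0.1, 0.2, and 0.3. It is not the authors' implementation: the function names, the Euclidean distance metric, the choice k = 2, the unweighted average, the reading of σ² as the mean squared Δz, and the toy input values are all assumptions made for this example.

```python
import numpy as np


def knn_photoz(train_colors, train_z, query_colors, k=2):
    """Hypothetical helper: mean redshift of the k nearest training objects.

    Distances are Euclidean in color space (an assumption of this sketch).
    """
    # Pairwise differences, shape (n_query, n_train, n_colors).
    diffs = query_colors[:, None, :] - train_colors[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    # Indices of the k closest training objects for each query object.
    nearest = np.argsort(dists, axis=1)[:, :k]
    return train_z[nearest].mean(axis=1)


def summarize(z_phot, z_spec, thresholds=(0.1, 0.2, 0.3)):
    """Hypothetical helper: mean squared delta-z and fractions within each threshold."""
    dz = z_phot - z_spec
    variance = float(np.mean(dz ** 2))
    fractions = {t: float(np.mean(np.abs(dz) < t)) for t in thresholds}
    return variance, fractions


# Toy data (invented for illustration; not SDSS or GALEX measurements).
train_colors = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
train_z = np.array([0.5, 1.0, 1.5, 2.0])
test_colors = np.array([[0.1, 0.1], [2.9, 2.9]])
test_z_spec = np.array([0.7, 2.0])

z_phot = knn_photoz(train_colors, train_z, test_colors, k=2)
variance, fractions = summarize(z_phot, test_z_spec)
print(z_phot, variance, fractions)
```

In a realistic setting the training set would be a representative subset of the spectroscopic sample and the test set the remaining unseen objects, mirroring the pseudo-blind test described in the abstract.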
