ArborZ: PHOTOMETRIC REDSHIFTS USING BOOSTED DECISION TREES

Precision photometric redshifts will be essential for extracting cosmological parameters from the next generation of wide-area imaging surveys. In this paper, we introduce a photometric redshift algorithm, ArborZ, based on the machine-learning technique of boosted decision trees. We study the algorithm using galaxies from the Sloan Digital Sky Survey (SDSS) and from mock catalogs intended to simulate both the SDSS and the upcoming Dark Energy Survey. We show that it improves upon the performance of existing algorithms. Moreover, the method naturally leads to the reconstruction of a full probability density function (PDF) for the photometric redshift of each galaxy, not merely a single "best estimate" and error, and also provides a photo-z quality figure of merit for each galaxy that can be used to reject outliers. We show that the stacked PDFs yield a more accurate reconstruction of the redshift distribution N(z). We discuss limitations of the current algorithm and ideas for future work.
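
As a rough illustration of the stacking step mentioned above, the sketch below sums per-galaxy redshift PDFs bin by bin to form an estimate of N(z). It is not the ArborZ implementation. The function name `stack_pdfs`, the shared uniform redshift binning, and the per-galaxy normalization are all assumptions made for this example.

```python
from typing import List, Sequence


def stack_pdfs(pdfs: Sequence[Sequence[float]], bin_width: float) -> List[float]:
    """Sum per-galaxy redshift PDFs bin by bin to estimate N(z).

    Assumption (not from the paper): every PDF is tabulated on the same
    uniform redshift grid with spacing `bin_width`.
    """
    if not pdfs:
        raise ValueError("need at least one PDF")
    n_bins = len(pdfs[0])
    n_of_z = [0.0] * n_bins
    for pdf in pdfs:
        if len(pdf) != n_bins:
            raise ValueError("all PDFs must share the same redshift binning")
        # Normalize so each galaxy contributes unit probability in total.
        norm = sum(pdf) * bin_width
        if norm <= 0:
            raise ValueError("PDF must have positive total probability")
        for k, p in enumerate(pdf):
            n_of_z[k] += p / norm
    return n_of_z


# Two toy galaxies on a four-bin grid with spacing 0.25 in redshift.
toy_pdfs = [
    [0.0, 2.0, 2.0, 0.0],  # probability split between bins 1 and 2
    [0.0, 0.0, 1.0, 3.0],  # probability mostly in bin 3
]
stacked = stack_pdfs(toy_pdfs, bin_width=0.25)
```

Because each PDF is normalized to unit area, the stacked curve integrates to the number of galaxies. A histogram of single best estimates would instead place each galaxy entirely in one bin and discard the width of its PDF.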
