RANDOM FORESTS FOR PHOTOMETRIC REDSHIFTS

The main challenge in photometric redshift estimation today lies not in accuracy but in understanding the uncertainties. We introduce an empirical method based on Random Forests to address this issue. The training algorithm builds a set of optimal decision trees on subsets of the available spectroscopic sample; each tree provides an independent constraint on the redshift of each galaxy. The combined forest estimates have intriguing statistical properties, notably Gaussian errors. We demonstrate the power of our approach on multi-color measurements from the Sloan Digital Sky Survey.
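The idea of training trees on subsets of a spectroscopic sample and combining their estimates can be illustrated with a minimal sketch. It is not the implementation used in this work: the mock color-redshift relation, the function names (make_mock_sample, build_tree, fit_forest, predict_forest), the tree depth, and the leaf size are all assumptions chosen for illustration, and the trees are grown greedily rather than optimally. The spread of the per-tree predictions is shown only as a crude proxy for the uncertainty, not as the error estimator of the method.

```python
import random
import statistics


def make_mock_sample(n, seed):
    """Mock 'spectroscopic' sample: (colors, redshift) pairs. Purely illustrative."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.uniform(0.0, 0.4)
        colors = (1.0 + 2.0 * z + rng.gauss(0.0, 0.03),
                  0.5 + 3.0 * z * z + rng.gauss(0.0, 0.03))
        rows.append((colors, z))
    return rows


def build_tree(rows, depth, rng, min_leaf=5):
    """Greedy regression tree; each node splits on one randomly chosen color."""
    n = len(rows)
    total = sum(z for _, z in rows)
    if depth == 0 or n < 2 * min_leaf:
        return ("leaf", total / n)
    feat = rng.randrange(len(rows[0][0]))
    ordered = sorted(rows, key=lambda r: r[0][feat])
    best_score, best_i, left_sum = None, None, 0.0
    for i in range(1, n):
        left_sum += ordered[i - 1][1]  # sum of redshifts in the first i rows
        if i < min_leaf or n - i < min_leaf:
            continue
        if ordered[i - 1][0][feat] == ordered[i][0][feat]:
            continue  # cannot split between identical color values
        # Maximizing this score is equivalent to minimizing the summed
        # squared error of the two child nodes.
        score = left_sum ** 2 / i + (total - left_sum) ** 2 / (n - i)
        if best_score is None or score > best_score:
            best_score, best_i = score, i
    if best_i is None:
        return ("leaf", total / n)
    threshold = 0.5 * (ordered[best_i - 1][0][feat] + ordered[best_i][0][feat])
    return ("node", feat, threshold,
            build_tree(ordered[:best_i], depth - 1, rng, min_leaf),
            build_tree(ordered[best_i:], depth - 1, rng, min_leaf))


def predict_tree(tree, colors):
    """Walk down the tree until a leaf is reached and return its mean redshift."""
    while tree[0] == "node":
        _, feat, threshold, left, right = tree
        tree = left if colors[feat] <= threshold else right
    return tree[1]


def fit_forest(rows, n_trees=50, depth=6, seed=0):
    """Train each tree on a bootstrap resample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]
        forest.append(build_tree(sample, depth, rng))
    return forest


def predict_forest(forest, colors):
    """Return the mean and the standard deviation of the per-tree estimates."""
    preds = [predict_tree(tree, colors) for tree in forest]
    return statistics.fmean(preds), statistics.stdev(preds)


train = make_mock_sample(400, seed=1)
test = make_mock_sample(100, seed=2)
forest = fit_forest(train)
errors = [predict_forest(forest, colors)[0] - z for colors, z in test]
rms = (sum(e * e for e in errors) / len(errors)) ** 0.5
print(f"held-out RMS redshift error: {rms:.3f}")
```

Resampling with replacement is what lets each tree see a different subset of the training galaxies, so the individual estimates differ and their average is more stable than any single tree. In practice one would likely use an established library implementation and real survey photometry rather than this toy code.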
