Statistical classification techniques for photometric supernova typing

Future photometric supernova surveys will produce vastly more candidates than can be followed up spectroscopically, highlighting the need for effective classification methods based on light curves alone. Here we introduce boosting and kernel density estimation techniques that require minimal astrophysical input, and compare their performance on 20 000 simulated Dark Energy Survey light curves. We demonstrate that these methods perform very well provided a representative sample of the full population is used for training. Interestingly, we find that they do not require the redshift of the host galaxy or candidate supernova. However, training on the types of spectroscopic subsamples currently produced by supernova surveys leads to poor performance because of the resulting bias in training, and we recommend that special attention be given to the creation of representative training samples. We show that, given a typical non-representative training sample, S, one can expect to extract a representative subsample of about 10 per cent of the size of S, which is large enough to outperform the methods trained on all of S.
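To make the kernel density estimation idea concrete, the following is a minimal sketch of a generic KDE-based classifier, not the implementation used in the paper. It assumes each light curve has already been reduced to a single hypothetical scalar feature, estimates one Gaussian-kernel density per class from the training set, and combines those densities with the class fractions through Bayes' rule. The function names, the bandwidth of 0.3, and the toy "Ia" and "non-Ia" feature distributions are all illustrative assumptions.

```python
import math
import random


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the 1-D density of `samples`
    using a Gaussian kernel of fixed width `bandwidth`."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def train_kde_classifier(features_by_class, bandwidth=0.3):
    """Build one (prior, density) pair per class.
    The prior is simply the class fraction in the training set, so a
    non-representative training set biases it directly."""
    total = sum(len(values) for values in features_by_class.values())
    return {
        label: (len(values) / total, gaussian_kde(values, bandwidth))
        for label, values in features_by_class.items()
    }


def class_probabilities(model, x):
    """Posterior class probabilities at feature value `x` via Bayes' rule."""
    scores = {label: prior * density(x) for label, (prior, density) in model.items()}
    norm = sum(scores.values())
    if norm == 0.0:
        # Far from every training point: fall back to the priors.
        return {label: prior for label, (prior, _) in model.items()}
    return {label: score / norm for label, score in scores.items()}


# Toy training set (purely synthetic, for illustration only).
rng = random.Random(0)
training = {
    "Ia": [rng.gauss(0.0, 0.5) for _ in range(200)],
    "non-Ia": [rng.gauss(2.0, 0.5) for _ in range(200)],
}
model = train_kde_classifier(training)
probs_low = class_probabilities(model, 0.0)
probs_high = class_probabilities(model, 2.0)
```

Because both the priors and the densities come from the training set alone, this sketch also shows why a biased spectroscopic subsample can degrade performance: the estimated densities then describe the subsample rather than the full population.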
