Automating Discovery and Classification of Transients and Variable Stars in the Synoptic Survey Era

The rate of image acquisition in modern synoptic imaging surveys has already begun to outpace the feasibility of keeping astronomers in the real-time discovery and classification loop. Here we present the inner workings of a framework, based on machine-learning algorithms, that captures expert training and ground-truth knowledge about the variable and transient sky to automate (1) the process of discovery on image differences, and (2) the generation of preliminary science-type classifications of discovered sources. Since follow-up resources for extracting novel science from fast-changing transients are precious, self-calibrating classification probabilities must be couched in terms of efficiencies for discovery and purity of the samples generated. We estimate the purity and efficiency in identifying real sources with a two-epoch image-difference discovery algorithm for the Palomar Transient Factory (PTF) survey. Once given a source discovery, using machine-learned classification trained on PTF data, we distinguish between transients and variable stars with a 3.8% overall error rate (with 1.7% errors for imaging within the Sloan Digital Sky Survey footprint). At >96% classification efficiency, the samples achieve 90% purity. Initial classifications are shown to rely primarily on context-based features, determined from the data itself and external archival databases. In the first year of autonomous operations of PTF, this discovery and classification framework led to several significant science results, from outbursting young stars to subluminous Type IIP supernovae to candidate tidal disruption events. We discuss future directions of this approach, including the possible roles of crowdsourcing and the scalability of machine learning to future surveys such as the Large Synoptic Survey Telescope (LSST).
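The trade-off between efficiency and purity mentioned above can be illustrated with a minimal sketch. It assumes the conventional definitions (efficiency is the fraction of real sources that are selected; purity is the fraction of selected candidates that are real) and a classifier that outputs a score per candidate. The function name `purity_efficiency`, the threshold, and the toy labels and scores are hypothetical and are not taken from the PTF pipeline.

```python
from typing import Sequence, Tuple


def purity_efficiency(
    is_real: Sequence[bool],
    scores: Sequence[float],
    threshold: float,
) -> Tuple[float, float]:
    """Return (purity, efficiency) for candidates with score >= threshold.

    Assumed definitions (standard, not confirmed against the paper):
      purity     = TP / (TP + FP)
      efficiency = TP / (TP + FN)
    """
    selected = [s >= threshold for s in scores]
    tp = sum(1 for real, sel in zip(is_real, selected) if real and sel)
    fp = sum(1 for real, sel in zip(is_real, selected) if not real and sel)
    fn = sum(1 for real, sel in zip(is_real, selected) if real and not sel)
    # Guard against empty selections or samples with no real sources.
    purity = tp / (tp + fp) if (tp + fp) else float("nan")
    efficiency = tp / (tp + fn) if (tp + fn) else float("nan")
    return purity, efficiency


# Toy data (hypothetical): four real sources and two bogus detections.
labels = [True, True, True, False, False, True]
scores = [0.9, 0.8, 0.4, 0.7, 0.2, 0.6]
purity, efficiency = purity_efficiency(labels, scores, threshold=0.5)
```

Raising the threshold generally increases purity at the cost of efficiency, which is why a classification probability is only useful for allocating follow-up resources when it is quoted together with both quantities.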
