Machine Learning for the Zwicky Transient Facility

The Zwicky Transient Facility (ZTF) is a large optical survey in multiple filters that produces hundreds of thousands of transient alerts per night. We describe here various machine learning (ML) implementations and plans to make maximal use of this large data set by taking advantage of the temporal nature of the data and by combining it with other data sets. We start with the initial steps of separating bogus candidates from real ones and separating stars from galaxies, and go on to the classification of real objects into various classes. Besides the usual methods (e.g., those based on features extracted from light curves), we also describe early plans for alternative methods, including the use of domain adaptation and deep learning. In a similar fashion, we describe efforts to detect fast-moving asteroids. We also describe the use of the Zooniverse platform to help with classification through the creation of training samples and through active learning. Finally, we mention the synergistic aspects of ZTF and LSST from the ML perspective.

Umaa Rebbapragada | Richard Walters | Adam A. Miller | Richard Dekany | Ashish Mahabal | Reed Riddle | Rahul Biswas | Thomas A. Prince | Scott Adams | V. Zach Golkhou | Mansi M. Kasliwal | Dmitry A. Duev | John Parejko | Suvi Gezari | Sjoert van Velzen | Tiara Hung | Chris Lintott | Darryl Wright | Steven Groom | Eric C. Bellm | Frank J. Masci | Brian Bue | Matthew Graham | Chris Cannella | Quan-Zhi Ye | David L. Shupe | Paula Szkody | Nima Sedaghat | Ulrich Feindt | Nadejda Blagorodnova | Jan van Roestel | Kevin Burdge | Chan-Kao Chang | Jakob Nordin | Charlotte Ward | Doug Branton | Andrew Connolly | Lucy Fortson | Sara Frederick | C. Fremling | Shrinivas Kulkarni | Thomas Kupfer | Hsing Wen Lin | Ragnhild Lunnan | Ben Rusholme | Nicholas Saunders | Leo P. Singer | Maayane T. Soumagnac | Yutaro Tachibana | Kushal Tirumala
