Machine-learning-based real-bogus system for the HSC-SSP moving object detection pipeline

Machine learning techniques are widely applied in many modern optical sky surveys, e.g., Pan-STARRS1, PTF/iPTF, and the Subaru/Hyper Suprime-Cam survey, to reduce human intervention in data verification. In this study, we have established a machine-learning-based real-bogus system to reject false detections in the Subaru/Hyper Suprime-Cam Strategic Survey Program (HSC-SSP) source catalog. With fewer false positives, the HSC-SSP moving object detection pipeline can operate more effectively. To train the real-bogus system, we use stationary sources as the real training set and "flagged" data as the bogus set. The training set contains 47 features, most of which are photometric measurements and shape moments generated by the HSC image reduction pipeline (hscPipe). Our system reaches a true positive rate (tpr) of ~96% at a false positive rate (fpr) of ~1%, or a tpr of ~99% at an fpr of ~5%. We therefore conclude that stationary sources are decent real training samples, and that photometric measurements and shape moments can be used to reject false positives effectively.
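The operating points quoted above (tpr at a fixed fpr) can be read off from any classifier that assigns each source a "real" score. The following is a minimal sketch of that bookkeeping, not the authors' code: the function name `tpr_at_fpr` is hypothetical, a source is assumed to be called real when its score is at or above the threshold, and the scores and labels in the usage lines are toy values, not HSC-SSP data.

```python
def tpr_at_fpr(scores, labels, max_fpr):
    """Find the lowest score threshold whose false positive rate stays
    at or below max_fpr, and report the true positive rate there.

    scores : classifier outputs, higher means "more likely real"
    labels : 1 for real sources, 0 for bogus sources
    Returns (threshold, tpr, fpr); a source is accepted as real when
    score >= threshold. Hypothetical helper, for illustration only.
    """
    n_real = sum(1 for y in labels if y == 1)
    n_bogus = len(labels) - n_real
    if n_real == 0 or n_bogus == 0:
        raise ValueError("need at least one real and one bogus sample")

    # Sweep the threshold downward through the sorted scores.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    best = (float("inf"), 0.0, 0.0)  # accept nothing: tpr = fpr = 0
    tp = fp = 0
    i = 0
    while i < len(ranked):
        score = ranked[i][0]
        # Tied scores must be accepted or rejected together.
        while i < len(ranked) and ranked[i][0] == score:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr = fp / n_bogus
        if fpr > max_fpr:
            break  # fpr only grows as the threshold drops further
        best = (score, tp / n_real, fpr)
    return best


# Toy usage: 4 real and 4 bogus sources with made-up scores.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 1, 0, 0, 0]
threshold, tpr, fpr = tpr_at_fpr(toy_scores, toy_labels, max_fpr=0.25)
```

Because both rates can only increase as the threshold is lowered, the last threshold that still satisfies the fpr limit also gives the highest tpr available under that limit.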
