Weakly-Supervised Anomaly Detection in the Milky Way

Large-scale astrophysics datasets present an opportunity for new machine learning techniques to identify regions of interest that might otherwise be overlooked by traditional searches. To this end, we use Classification Without Labels (CWoLa), a weakly-supervised anomaly detection method, to identify cold stellar streams within the more than one billion Milky Way stars observed by the Gaia satellite. CWoLa operates without the use of labeled streams or knowledge of astrophysical principles. Instead, we train a classifier to distinguish between mixed samples for which the proportions of signal and background samples are unknown. This computationally lightweight strategy is able to detect both simulated streams and the known stream GD-1 in data. Originally designed for high-energy collider physics, this technique may have broad applicability within astrophysics as well as other domains interested in identifying localized anomalies.
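The core idea of training on mixed samples can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are one-dimensional toy Gaussians rather than Gaia stars, the signal fractions (20% and 2%) are arbitrary, the names `make_mixed_sample`, `fit_logistic`, and `auc` are hypothetical, and a plain logistic regression stands in for whatever classifier is actually used. The classifier only ever sees which mixed sample a point came from, yet its score ends up separating true signal from true background.

```python
import numpy as np

rng = np.random.default_rng(0)


def make_mixed_sample(n, signal_fraction):
    """Toy data: signal ~ N(2, 1), background ~ N(0, 1) in a single feature."""
    is_signal = rng.random(n) < signal_fraction
    x = rng.normal(loc=np.where(is_signal, 2.0, 0.0), scale=1.0)
    return x, is_signal


# Two mixed samples with different signal fractions. The fractions and the
# per-point truth flags are never shown to the classifier.
x_a, truth_a = make_mixed_sample(5000, 0.20)  # signal-enriched sample
x_b, truth_b = make_mixed_sample(5000, 0.02)  # signal-depleted sample

x = np.concatenate([x_a, x_b])
mixed_label = np.concatenate([np.ones_like(x_a), np.zeros_like(x_b)])
truth = np.concatenate([truth_a, truth_b])


def fit_logistic(x, y, lr=0.1, steps=2000):
    """One-feature logistic regression fitted by batch gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(w * x + b)))
        grad = p - y
        w -= lr * np.mean(grad * x)
        b -= lr * np.mean(grad)
    return w, b


def auc(score, positive):
    """Rank-based area under the ROC curve (Mann-Whitney U statistic)."""
    order = np.argsort(score)
    ranks = np.empty(len(score))
    ranks[order] = np.arange(1, len(score) + 1)
    n_pos = positive.sum()
    n_neg = len(score) - n_pos
    return (ranks[positive].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Train on sample membership only (A vs. B), then evaluate against the truth.
w, b = fit_logistic(x, mixed_label)
score = w * x + b
truth_auc = auc(score, truth)
print(f"weight = {w:.3f}, AUC against true signal labels = {truth_auc:.3f}")
```

In this toy setting the only difference between the two samples is their signal content, so a classifier that separates the samples necessarily learns a score that increases with signal likelihood. This holds only as long as the background is distributed the same way in both samples, which is an assumption of the sketch.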
