Threshold optimization for classification in imbalanced data in a problem of gamma-ray astronomy

We introduce a method to minimize the mean square error (MSE) of an estimator that is derived from a classification. The method chooses an optimal discrimination threshold on the output of a classification algorithm and handles unequal and unknown misclassification costs as well as class imbalance. We apply the approach to data from the MAGIC experiment in astronomy to choose an optimal threshold for signal-background separation. In this application, the goal is to estimate the number of signal events in a dataset with a very unfavorable signal-to-background ratio. Minimizing the MSE of the estimate is a rather general approach that can be adapted to other applications in which an estimator is derived from a classification. If the classification depends on parameters other than, or in addition to, the discrimination threshold, MSE minimization can be used to optimize these parameters as well. We illustrate this by optimizing the parameters of a logistic regression, which leads to relevant improvements over the approach currently used in the MAGIC experiment.
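The idea of choosing a threshold by minimizing the MSE of a count estimate can be illustrated with a minimal sketch. It is not the estimator or procedure used in the MAGIC analysis. It assumes a simple adjusted-count estimate, (n_pos - fpr * N) / (tpr - fpr), with true and false positive rates treated as known, and it assumes rough guesses for the numbers of signal and background events. The function names, score distributions, and event counts are all hypothetical.

```python
import numpy as np

def mse_of_count_estimate(tpr, fpr, n_signal, n_background):
    """MSE of the adjusted-count estimate (n_pos - fpr * N) / (tpr - fpr).

    Assumption: tpr and fpr are known exactly, so the estimate is unbiased
    and its MSE equals its variance.
    """
    var_n_pos = n_signal * tpr * (1 - tpr) + n_background * fpr * (1 - fpr)
    return var_n_pos / (tpr - fpr) ** 2

def best_threshold(signal_scores, background_scores,
                   n_signal, n_background, thresholds):
    """Return (threshold, mse) with the smallest MSE over a grid."""
    best = None
    for t in thresholds:
        tpr = np.mean(signal_scores >= t)      # fraction of signal kept
        fpr = np.mean(background_scores >= t)  # fraction of background kept
        if tpr <= fpr:
            continue  # estimate is undefined or useless at this threshold
        mse = mse_of_count_estimate(tpr, fpr, n_signal, n_background)
        if best is None or mse < best[1]:
            best = (float(t), float(mse))
    return best

# Hypothetical classifier scores from labeled (e.g. simulated) events.
rng = np.random.default_rng(0)
signal_scores = rng.beta(5, 2, size=5000)
background_scores = rng.beta(2, 5, size=5000)

# Hypothetical, strongly imbalanced event counts.
grid = np.linspace(0.05, 0.95, 19)
threshold, mse = best_threshold(signal_scores, background_scores,
                                n_signal=100, n_background=100_000,
                                thresholds=grid)
print(threshold, mse)
```

Because the criterion depends on the assumed event counts, the selected threshold shifts when the signal-to-background ratio changes, which is why a fixed default threshold is generally a poor choice for imbalanced data.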
