Methods for multidimensional event classification: A case study using images from a Cherenkov gamma-ray telescope

Abstract: We present results from a case study comparing different multivariate classification methods. The input is a set of Monte Carlo data, generated and approximately triggered and pre-processed for an imaging gamma-ray Cherenkov telescope. Such data belong to two classes: events originating from incident gamma rays and events caused by hadronic showers. There is only a weak discrimination between signal (gamma) and background (hadrons), making the data an excellent proving ground for classification techniques. The data and methods are described, and a comparison of the results is made. Several methods give results comparable in quality within small fluctuations, suggesting that they perform at or close to the Bayesian limit of achievable separation. Other methods give clearly inferior or inconclusive results. Some problems that this study cannot address are also discussed.
