Detection of Pulsar Candidates using Bagging Method

Abstract Pulsar candidate classification is a major challenge in astrophysics. Bagging is an ensemble method widely used to improve the performance of classification algorithms, notably in pulsar search. In this paper, we show how the Bagging Method can improve pulsar candidate detection when combined with four base classifiers: Core Vector Machines (CVM), K-Nearest Neighbors (KNN), Artificial Neural Networks (ANN), and the CART Decision Tree (CDT). The error rate, Area Under the Curve (AUC), and computation time (CT) are measured to compare the performance of the different classifiers. Experiments are conducted on the High Time Resolution Universe (HTRU2) dataset, obtained from the UCI Machine Learning Repository.
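The approach described above, bagging applied to a base classifier and scored by error rate and AUC, can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes scikit-learn is available, uses KNN as the base learner, and substitutes synthetic data with the same shape as HTRU2 (8 features, an imbalanced binary label) in place of the real dataset.

```python
# Sketch: bagging a KNN base learner, evaluated by error rate and AUC.
# Synthetic stand-in for HTRU2: 8 features, imbalanced binary labels.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=2000, n_features=8,
                           weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          random_state=0, stratify=y)

# Bagging trains n_estimators copies of the base learner on bootstrap
# resamples of the training set and averages their predictions.
bag = BaggingClassifier(KNeighborsClassifier(n_neighbors=5),
                        n_estimators=25, random_state=0)
bag.fit(X_tr, y_tr)

error_rate = 1.0 - bag.score(X_te, y_te)
auc = roc_auc_score(y_te, bag.predict_proba(X_te)[:, 1])
print(f"error rate = {error_rate:.3f}, AUC = {auc:.3f}")
```

The same pattern applies to the other base learners in the study: only the estimator passed to `BaggingClassifier` changes, while the evaluation code stays identical.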
