Detecting Quasars in Large-Scale Astronomical Surveys

We present a classification-based approach for identifying quasi-stellar radio sources (quasars) in the Sloan Digital Sky Survey and evaluate its performance on a manually labeled training set. While reasonable results can already be obtained with approaches that use only photometric data, our experiments indicate that simple but problem-specific features extracted from spectroscopic data can significantly improve classification performance. Because our approach is orthogonal to the existing classification schemes used to build the spectroscopic catalogs, its results are well suited for a mutual assessment of the approaches' accuracies.
