The detection of globular clusters in galaxies as a data mining problem

ABSTRACT We present an application of self-adaptive supervised learning classifiers, derived from the Machine Learning paradigm, to the identification of candidate Globular Clusters in deep, wide-field, single-band HST images. Several methods provided by the DAME (Data Mining & Exploration) web application were tested and compared on the NGC1399 HST data described in Paolillo et al. (2011). The best results were obtained using a Multi Layer Perceptron with Quasi Newton learning rule, which achieved a classification accuracy of 98.3%, with a completeness of 97.8% and 1.6% contamination. An extensive set of experiments revealed that the use of accurate structural parameters (effective radius, central surface brightness) does improve the final result, but only by ∼5%. It is also shown that the method is capable of retrieving extreme sources (for instance, very extended objects) that are missed by more traditional approaches.

Key words: Globular clusters; elliptical galaxies; NGC1399; Machine Learning
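The three figures of merit quoted in the abstract (classification accuracy, completeness, contamination) can be illustrated with a minimal sketch. The definitions used here are an assumption based on common usage rather than taken from the paper: completeness is the fraction of true globular clusters that are recovered, and contamination is the fraction of objects classified as globular clusters that are not. The function name, the label strings, and the toy data are hypothetical and are not the NGC1399 sample.

```python
def classification_metrics(true_labels, predicted_labels, positive="GC"):
    """Return accuracy, completeness and contamination for a binary classifier.

    Assumed definitions (not confirmed by the source):
      accuracy      = (TP + TN) / N
      completeness  = TP / (TP + FN)   # recovered fraction of true GCs
      contamination = FP / (TP + FP)   # non-GC fraction of the selected sample
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")

    tp = fp = fn = tn = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if truth == positive and guess == positive:
            tp += 1
        elif truth != positive and guess == positive:
            fp += 1
        elif truth == positive and guess != positive:
            fn += 1
        else:
            tn += 1

    total = tp + fp + fn + tn
    nan = float("nan")
    return {
        "accuracy": (tp + tn) / total if total else nan,
        "completeness": tp / (tp + fn) if (tp + fn) else nan,
        "contamination": fp / (tp + fp) if (tp + fp) else nan,
    }


# Toy example with made-up labels: 4 true GCs and 4 other sources.
true_labels = ["GC", "GC", "GC", "GC", "other", "other", "other", "other"]
predicted_labels = ["GC", "GC", "GC", "other", "GC", "other", "other", "other"]
metrics = classification_metrics(true_labels, predicted_labels)
```

With these definitions a classifier can reach high accuracy while still having poor completeness if globular clusters are rare in the sample, which is why the three numbers are usually reported together.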

[1]  Thomas H. Puzia,et al.  The ages of globular clusters in NGC 4365 revisited with deep HST observations , 2005 .

[2]  Christopher M. Bishop,et al.  Neural networks for pattern recognition , 1995 .

[3]  Shihab A. Shamma,et al.  Minimum mean square error estimation of connectivity in biological neural networks , 1991, Biological Cybernetics.

[4]  Jon A. Holtzman,et al.  Measuring Sizes of Marginally Resolved Young Globular Clusters with the Hubble Space Telescope , 2001, astro-ph/0109460.

[5]  K. Gebhardt,et al.  The Globular Cluster System of NGC 1399. III. VLT Spectroscopy and Database , 2004 .

[6]  Chile,et al.  Large-scale study of the NGC 1399 globular cluster system in Fornax , 2006, astro-ph/0603349.

[7]  Dirk P. Kroese,et al.  The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte-Carlo Simulation and Machine Learning , 2004 .

[8]  David G. Stork,et al.  Pattern Classification , 1973 .

[9]  Mauro Garofalo,et al.  DAME: A Web Oriented Infrastructure for Scientific Data Mining & Exploration , 2010, ArXiv.

[10]  William C. Davidon,et al.  Variable Metric Method for Minimization , 1959, SIAM J. Optim..

[11]  R. Fletcher,et al.  A New Approach to Variable Metric Algorithms , 1970, Comput. J..

[12]  Sotiris B. Kotsiantis,et al.  Supervised Machine Learning: A Review of Classification Techniques , 2007, Informatica.

[13]  C. G. Broyden The Convergence of a Class of Double-rank Minimization Algorithms 1. General Considerations , 1970 .

[14]  Jorge Nocedal,et al.  Algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound-constrained optimization , 1997, TOMS.

[15]  D. Goldfarb A family of variable-metric methods derived by variational means , 1970 .

[16]  Paul Goudfrooij,et al.  Probing the GC-LMXB Connection in NGC 1399: A Wide-Field Study with the Hubble Space Telescope and Chandra , 2011, 1105.2561.

[17]  D. Shanno Conditioning of Quasi-Newton Methods for Function Minimization , 1970 .

[18]  Laura P. Dunn, Helmut Jerjen  First Results from SAPAC: Toward a Three-dimensional Picture of the Fornax Cluster Core , 2006 .

[19]  John H. Holland,et al.  Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence , 1992 .

[20]  Richard S. Sutton,et al.  Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.

[21]  Jorge Nocedal,et al.  Representations of quasi-Newton matrices and their use in limited memory methods , 1994, Math. Program..

[22]  L. Ho,et al.  Detailed structural decomposition of galaxy images , 2002, astro-ph/0204182.