Neural Gas Based Classification of Globular Clusters

Classification is a typical example of a highly complex task in data-driven scientific and real-life problems, especially when approached with traditional techniques. The supervised and unsupervised paradigms of Machine Learning provide self-adaptive, semi-automatic methods that can explore large volumes of data spanning a multi-dimensional parameter space, which makes them well suited to separating classes of objects reliably and efficiently. In Astrophysics, the identification of candidate Globular Clusters in deep, wide-field, single-band images is one such case in which self-adaptive methods have demonstrated high performance and reliability. Here we experimented with some variants of the well-known Neural Gas model, exploring both the supervised and unsupervised paradigms of Machine Learning for the classification of Globular Clusters. The main goal of this work was to verify whether the computational efficiency of methods for solving complex data-driven problems can be improved by exploiting parallel programming on GPUs. Using the astrophysical case as a playground, a further goal was to scientifically validate such models for later application in other contexts.
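
For readers unfamiliar with the Neural Gas model mentioned above, the following is a minimal sketch of the classic unsupervised, rank-based update rule only. It is not the supervised or GPU-parallel variants studied in this work. The function names (`neural_gas_fit`, `quantization_error`), the default parameter values, and the exponential decay schedule are assumptions chosen for illustration.

```python
import math
import random


def neural_gas_fit(data, n_units=4, n_epochs=20,
                   eps=(0.5, 0.01), lam=(2.0, 0.1), seed=0):
    """Fit prototype vectors with the classic rank-based Neural Gas update.

    data : list of equal-length numeric tuples or lists
    eps  : (initial, final) learning rate      (illustrative values)
    lam  : (initial, final) neighbourhood range (illustrative values)
    """
    rng = random.Random(seed)
    # Initialise the units on randomly chosen samples.
    units = [list(x) for x in rng.sample(data, n_units)]
    t_max = n_epochs * len(data)
    t = 0
    for _ in range(n_epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for idx in order:
            x = data[idx]
            # Exponentially decay learning rate and neighbourhood range.
            frac = t / t_max
            eps_t = eps[0] * (eps[1] / eps[0]) ** frac
            lam_t = lam[0] * (lam[1] / lam[0]) ** frac
            # Rank the units by distance to the sample (rank 0 = closest).
            by_dist = sorted(range(n_units),
                             key=lambda i: math.dist(units[i], x))
            # Every unit moves toward the sample, weighted by its rank.
            for rank, i in enumerate(by_dist):
                step = eps_t * math.exp(-rank / lam_t)
                units[i] = [w + step * (xj - w)
                            for w, xj in zip(units[i], x)]
            t += 1
    return units


def quantization_error(data, units):
    """Mean distance from each sample to its nearest unit."""
    return sum(min(math.dist(x, u) for u in units) for x in data) / len(data)
```

Because every unit is updated for every sample, with a weight that depends only on its distance rank, the per-sample work is independent across units, which is the kind of structure that lends itself to the GPU parallelization discussed above.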
