Combining Human and Machine Learning for Morphological Analysis of Galaxy Images

Abstract

The increasing importance of digital sky surveys that collect many millions of galaxy images has reinforced the need for robust methods that can perform morphological analysis of large galaxy image databases. Citizen science initiatives such as Galaxy Zoo showed that large data sets of galaxy images can be analyzed effectively by nonscientist volunteers. However, databases generated by robotic telescopes grow much faster than any group of citizen scientists can process them, so computer analysis is required. Here, we propose using citizen science data to train machine learning systems, and present experimental results demonstrating that such training is feasible. Our findings show that the performance of machine learning depends on the quality of the training data, which can be improved by using samples that have a high degree of agreement between the citizen scientists. The source code of the method is publicly available.
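
The idea of keeping only samples with a high degree of agreement between citizen scientists can be illustrated with a minimal sketch. It is not the authors' code: the catalog layout, the class names, the function names `agreement` and `select_high_agreement`, and the 0.85 threshold are all assumptions chosen for illustration. Agreement is taken here to be the fraction of votes received by the most-voted class.

```python
from typing import Dict, List, Tuple


def agreement(votes: Dict[str, int]) -> Tuple[str, float]:
    """Return the most-voted class and the fraction of votes it received."""
    total = sum(votes.values())
    if total == 0:
        raise ValueError("galaxy has no votes")
    label = max(votes, key=votes.get)
    return label, votes[label] / total


def select_high_agreement(
    catalog: Dict[str, Dict[str, int]], threshold: float = 0.85
) -> List[Tuple[str, str, float]]:
    """Keep only galaxies whose majority vote fraction meets the threshold."""
    selected = []
    for galaxy_id, votes in catalog.items():
        label, fraction = agreement(votes)
        if fraction >= threshold:
            selected.append((galaxy_id, label, fraction))
    return selected


# Hypothetical vote counts; real catalogs would supply these per galaxy.
catalog = {
    "g1": {"spiral": 18, "elliptical": 2},
    "g2": {"spiral": 11, "elliptical": 9},
    "g3": {"spiral": 1, "elliptical": 19},
}

# Only g1 and g3 are kept as training samples; g2 is too ambiguous.
training_set = select_high_agreement(catalog, threshold=0.85)
```

Raising the threshold yields cleaner labels but a smaller training set, so in practice the value would have to be tuned against the amount of data available.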

[1]  G. Bruce Berriman,et al.  Astrophysics Source Code Library , 2012, ArXiv.

[2]  L. Ho,et al.  Detailed structural decomposition of galaxy images , 2002, astro-ph/0204182.

[3]  Wayne B. Hayes,et al.  SpArcFiRe: SCALABLE AUTOMATED DETECTION OF SPIRAL GALAXY ARM SEGMENTS , 2014, 1402.1910.

[4]  L. Shamir,et al.  Automatic quantitative morphological analysis of interacting galaxies , 2013, Astron. Comput..

[5]  P. Murdin MONTHLY NOTICES OF THE ROYAL ASTRONOMICAL SOCIETY , 2005 .

[6]  D. Block,et al.  Dust-penetrated arm classes: insights from rising and falling rotation curves , 2004, astro-ph/0502587.

[7]  V. Narayanan,et al.  Spectroscopic Target Selection in the Sloan Digital Sky Survey: The Main Galaxy Sample , 2002, astro-ph/0206225.

[8]  Lior Shamir,et al.  Knee X-Ray Image Analysis Method for Automated Detection of Osteoarthritis , 2009, IEEE Transactions on Biomedical Engineering.

[9]  Joan E. Beaudoin,et al.  BULLETIN of the American Society for Information Science and Technology June / July 2007 , 2007 .

[10]  Lior Shamir,et al.  Computer analysis of art , 2012, JOCCH.

[11]  C. Lintott,et al.  Galaxy Zoo 2: detailed morphological classifications for 304,122 galaxies from the Sloan Digital Sky Survey , 2013, 1308.3496.

[12]  Lior Shamir,et al.  WND-CHARM: Multi-purpose image classifier , 2013 .

[13]  C. Lintott,et al.  Galaxy Zoo 1: data release of morphological classifications for nearly 900 000 galaxies , 2010, 1007.3265.

[14]  E. al.,et al.  The Sloan Digital Sky Survey: Technical summary , 2000, astro-ph/0006396.

[15]  C. J. Conselice,et al.  New image statistics for detecting disturbed galaxy morphologies at high redshift , 2013, 1306.1238.

[16]  Marc Huertas-Company,et al.  Revisiting the Hubble sequence in the SDSS DR7 spectroscopic sample: a publicly available Bayesian automated classification , 2010, 1010.3018.

[17]  N. Otsu A threshold selection method from gray level histograms , 1979 .

[18]  Lior Shamir,et al.  Impressionism, expressionism, surrealism: Automated recognition of painters and schools of art , 2010, TAP.

[19]  C. Lintott,et al.  Galaxy Zoo: reproducing galaxy morphologies via machine learning★ , 2009, 0908.2033.

[20]  University of Toronto,et al.  A New Approach to Galaxy Morphology. I. Analysis of the Sloan Digital Sky Survey Early Data Release , 2003, astro-ph/0301239.

[21]  C. Lintott,et al.  Galaxy Zoo: morphologies derived from visual inspection of galaxies from the Sloan Digital Sky Survey , 2008, 0804.4483.

[22]  Lior Shamir,et al.  Automatic detection and quantitative assessment of peculiar galaxy pairs in Sloan Digital Sky Survey , 2014, 1407.5000.

[23]  Lior Shamir,et al.  MRI-based knee image for personal identification , 2013, Int. J. Biom..

[24]  Chien Y. Peng,et al.  DETAILED DECOMPOSITION OF GALAXY IMAGES. II. BEYOND AXISYMMETRIC MODELS , 2009, 0912.0731.

[25]  Chris J. Lintott,et al.  Galaxy Zoo: A Catalog of Overlapping Galaxy Pairs for Dust Studies , 2012, 1211.6723.

[26]  M. Teague Image analysis via the general theory of moments , 1980 .

[27]  British Ornithologists,et al.  Bulletin of the , 1999 .

[28]  Luc Simard,et al.  A CATALOG OF BULGE+DISK DECOMPOSITIONS AND UPDATED PHOTOMETRY FOR 1.12 MILLION GALAXIES IN THE SLOAN DIGITAL SKY SURVEY , 2011, 1107.1518.

[29]  Lior Shamir,et al.  Automatic morphological classification of galaxy images. , 2009, Monthly notices of the Royal Astronomical Society.

[30]  Lior Shamir,et al.  Automatic detection of peculiar galaxies in large datasets of galaxy images , 2012, J. Comput. Sci..

[31]  Lior Shamir,et al.  Pattern Recognition Software and Techniques for Biological Image Analysis , 2010, PLoS Comput. Biol..

[32]  Lior Shamir,et al.  WND-CHARM: Multi-purpose image classification using compound image transforms , 2008, Pattern Recognit. Lett..

[33]  Christopher J. Conselice,et al.  The Relationship between Stellar Light Distributions of Galaxies and Their Formation Histories , 2003 .

[34]  Lior Shamir,et al.  Source Code for Biology and Medicine Open Access Wndchrm – an Open Source Utility for Biological Image Analysis , 2022 .

[35]  Alexander S. Szalay,et al.  Galaxy Zoo: the dependence of morphology and colour on environment , 2008, 0805.2612.

[36]  Lior Shamir,et al.  IICBU 2008: a proposed benchmark suite for biological image analysis , 2008, Medical & Biological Engineering & Computing.