When Too Similar Is Bad: A Practical Example of the Solar Dynamics Observatory Content-Based Image-Retrieval System

Measuring interest and relevance has always been one of the main concerns when analyzing the results of a Content-Based Image-Retrieval (CBIR) system. In this work, we present a unique problem that the Solar Dynamics Observatory (SDO) CBIR system encounters: too many highly similar images. With the SDO mission producing over 70,000 images of the Sun per day, the problem of finding similar images is transformed into the problem of finding similar solar events based on image similarity. However, the most similar images in our dataset are temporal neighbors capturing the same event instance. Therefore, a traditional CBIR system will return highly repetitive images rather than similar but distinct events. In this work, we outline the problem in detail and present several approaches tested to solve this important image data mining and information retrieval issue.
