Learning the Relative Importance of Features in Image Data

In computational analysis in scientific domains, images are often compared based on their features, e.g., size, depth and other domain-specific aspects. Certain features may be more significant than others when comparing the images and drawing corresponding inferences for specific applications. Though domain experts may have subjective notions of similarity for comparison, they seldom have a distance function that ranks the image features based on their relative importance. We propose a method called FeaturesRank for learning such a distance function in order to capture the semantics of the images. We are given training samples consisting of pairs of images and the extent of similarity identified for each pair. Using a guessed initial distance function, FeaturesRank clusters the given images in levels. It then adjusts the distance function based on the error between the clusters and the training samples, using heuristics proposed in this paper. The distance function that gives the lowest error is the output. It contains the features ranked in the order most appropriate for the domain. FeaturesRank is evaluated with real image data from nanotechnology and bioinformatics. The results of our evaluation are presented in the paper.
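The learning loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the form of the distance function, the clustering algorithm, or the adjustment heuristics, so the sketch assumes a weighted L1 distance over numeric feature vectors, single-linkage clustering at a list of increasing distance thresholds (one per level), and a simple coordinate search over the weights as a stand-in for the paper's heuristics. All function names, the toy data, and the thresholds are hypothetical.

```python
from itertools import combinations

def weighted_distance(a, b, weights):
    """Weighted L1 distance between two feature vectors (assumed form)."""
    return sum(w * abs(x - y) for w, x, y in zip(weights, a, b))

def cluster_levels(images, weights, thresholds):
    """For each threshold (ascending), merge images closer than the
    threshold (single linkage) and return one label list per level."""
    n = len(images)
    levels = []
    for t in thresholds:
        parent = list(range(n))

        def find(i):
            while parent[i] != i:
                parent[i] = parent[parent[i]]
                i = parent[i]
            return i

        for i, j in combinations(range(n), 2):
            if weighted_distance(images[i], images[j], weights) <= t:
                parent[find(i)] = find(j)
        levels.append([find(i) for i in range(n)])
    return levels

def predicted_level(levels, i, j):
    """First level at which i and j share a cluster (len(levels) if never)."""
    for depth, labels in enumerate(levels):
        if labels[i] == labels[j]:
            return depth
    return len(levels)

def clustering_error(images, pairs, weights, thresholds):
    """Mean absolute gap between predicted and expert similarity levels."""
    levels = cluster_levels(images, weights, thresholds)
    gaps = [abs(predicted_level(levels, i, j) - target) for i, j, target in pairs]
    return sum(gaps) / len(pairs)

def learn_feature_weights(images, pairs, thresholds, step=0.5, max_rounds=20):
    """Coordinate search over weights (a stand-in for the paper's heuristics).
    Starts from a guessed uniform distance function and keeps the weights
    with the lowest error seen."""
    n_features = len(images[0])
    best = [1.0] * n_features  # guessed initial distance function
    best_err = clustering_error(images, pairs, best, thresholds)
    for _ in range(max_rounds):
        improved = False
        for k in range(n_features):
            for delta in (step, -step):
                cand = list(best)
                cand[k] = max(0.0, cand[k] + delta)
                err = clustering_error(images, pairs, cand, thresholds)
                if err < best_err:
                    best, best_err, improved = cand, err, True
        if not improved:
            break
    # Rank feature indices from most to least heavily weighted.
    ranking = sorted(range(n_features), key=lambda k: best[k], reverse=True)
    return best, best_err, ranking

# Toy data: feature 0 separates the two groups, feature 1 is noise.
images = [(0, 0), (1, 8), (10, 1), (11, 9)]
# (image index, image index, expert similarity level; 0 = most similar)
pairs = [(0, 1, 0), (2, 3, 0), (0, 2, 1), (1, 3, 1)]
thresholds = [5, 16]

weights, err, ranking = learn_feature_weights(images, pairs, thresholds)
print(weights, err, ranking)
```

On this toy input the search lowers the weight of the noisy feature, so the first feature ends up ranked higher. A search this simple can stall on plateaus where small weight changes leave the clusters unchanged, which is one reason a real method would need more careful heuristics.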
