Interactive Image Feature Selection Aided by Dimensionality Reduction

Feature selection is an important step in designing image classification systems. Many automatic feature selection methods exist, but most are opaque to their users. We argue that users should be able to gain insight into how observations behave in the feature space, since such insight can guide the design of better features and the incorporation of domain knowledge. To this end, we propose a methodology for interactive, iterative selection of image features, aided by dimensionality reduction plots and complementary exploration tools. We evaluate the proposal on feature selection for skin lesion image classification.
