Assessment of dimensionality reduction based on communication channel model; application to immersive information visualization

We deal with large-scale, high-dimensional image data sets that require new approaches to data mining in which visualization plays the main role. Dimensionality reduction (DR) techniques are widely used to visualize high-dimensional data, but reducing the number of dimensions loses information. In this paper, we introduce a novel metric for assessing how well a DR technique preserves the structure of the data. We model the dimensionality reduction process as a communication channel that transfers data points from a high-dimensional space (the input) to a lower-dimensional one (the output). In this model, a co-ranking matrix measures the degree of similarity between the input and the output, and mutual information (MI) and entropy defined over the co-ranking matrix measure the quality of the applied DR technique. We validate our method by reducing the dimensionality of SIFT and Weber descriptors extracted from Earth Observation (EO) optical images, using Laplacian Eigenmaps (LE) and Stochastic Neighbor Embedding (SNE) as the DR techniques. The experimental results demonstrate that the DR technique with the largest MI and entropy preserves the structure of the data better than the others.
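To make the channel view concrete, the following is a minimal sketch of how a co-ranking matrix and an MI score over it might be computed. It is an illustration under stated assumptions, not the paper's implementation: the function names (`rank_matrix`, `coranking_matrix`, `coranking_mutual_information`) are hypothetical, Euclidean distance is assumed for the neighbor rankings, and the co-ranking matrix is normalized into a joint distribution over (input rank, output rank) before the standard MI formula is applied. The entropy term is left out because the abstract does not define it.

```python
import numpy as np


def rank_matrix(X):
    """ranks[i, j] = position of point j in the neighbor ordering of point i.

    Assumes Euclidean distance. Each point gets rank 0 with respect to
    itself; the other points get ranks 1 .. n-1.
    """
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, -1.0)  # force every point to sort first in its own row
    order = np.argsort(d, axis=1, kind="stable")
    n = X.shape[0]
    ranks = np.empty_like(order)
    ranks[np.arange(n)[:, None], order] = np.arange(n)[None, :]
    return ranks


def coranking_matrix(X_high, X_low):
    """Q[k, l] = number of pairs (i, j), i != j, whose rank is k+1 in the
    high-dimensional space and l+1 in the low-dimensional space."""
    n = X_high.shape[0]
    r_high = rank_matrix(X_high)
    r_low = rank_matrix(X_low)
    off_diag = ~np.eye(n, dtype=bool)  # skip the (i, i) self pairs
    Q = np.zeros((n - 1, n - 1), dtype=np.int64)
    np.add.at(Q, (r_high[off_diag] - 1, r_low[off_diag] - 1), 1)
    return Q


def coranking_mutual_information(Q):
    """Mutual information (in bits) of the normalized co-ranking matrix.

    Assumption: Q / Q.sum() is read as the joint distribution of the
    channel's input rank and output rank.
    """
    P = Q / Q.sum()
    p_in = P.sum(axis=1, keepdims=True)   # marginal over input ranks
    p_out = P.sum(axis=0, keepdims=True)  # marginal over output ranks
    nz = P > 0
    return float(np.sum(P[nz] * np.log2(P[nz] / (p_in @ p_out)[nz])))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(30, 5))          # stand-in for image descriptors
    faithful = X.copy()                   # an embedding that keeps all ranks
    unrelated = rng.normal(size=(30, 2))  # an embedding unrelated to X
    print(coranking_mutual_information(coranking_matrix(X, faithful)))
    print(coranking_mutual_information(coranking_matrix(X, unrelated)))
```

Under these assumptions, an embedding that preserves every neighbor rank yields a diagonal co-ranking matrix, and the MI reaches its maximum of log2(n - 1) bits; embeddings that scramble the ranks spread mass off the diagonal and score lower.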
