FINE: Fisher Information Nonparametric Embedding

We consider the problems of clustering, classification, and visualization of high-dimensional data when no straightforward Euclidean representation exists. In this paper, we propose using the properties of information geometry and statistical manifolds to define similarities between data sets through the Fisher information distance. Because the parameterization and geometry of the manifold are generally unknown, we show that this metric can be approximated using entirely nonparametric methods. Furthermore, by using multidimensional scaling methods, we are able to reconstruct the statistical manifold in a low-dimensional Euclidean space, enabling effective learning on the data. As a whole, we refer to our framework as Fisher information nonparametric embedding (FINE) and illustrate its uses on practical problems, including a biomedical application and document classification.
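
The pipeline described above (estimate a density for each data set, measure dissimilarity between the densities, then embed with multidimensional scaling) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the data sets are one-dimensional, densities are estimated with a Gaussian kernel density estimator on a shared grid, a Hellinger-type distance stands in for the Fisher information distance (the two agree up to a constant factor only for nearby densities), and classical MDS is used for the embedding. The function names `hellinger_matrix` and `classical_mds` are hypothetical.

```python
import numpy as np
from scipy.stats import gaussian_kde


def hellinger_matrix(datasets, grid):
    """Pairwise Hellinger-type distances between KDE estimates of 1-D data sets.

    Assumption: used only as a local stand-in for the Fisher information
    distance; it is not the exact geodesic distance on the manifold.
    """
    dx = grid[1] - grid[0]
    roots = []
    for x in datasets:
        p = gaussian_kde(x)(grid)        # nonparametric density estimate
        p = p / (p.sum() * dx)           # renormalize on the finite grid
        roots.append(np.sqrt(p))
    roots = np.array(roots)
    # Bhattacharyya coefficient: integral of sqrt(p * q) over the grid.
    bc = np.clip(roots @ roots.T * dx, 0.0, 1.0)
    D = np.sqrt(2.0 * (1.0 - bc))        # sqrt of integral (sqrt p - sqrt q)^2
    D = 0.5 * (D + D.T)                  # remove floating-point asymmetry
    np.fill_diagonal(D, 0.0)
    return D


def classical_mds(D, dim=2):
    """Classical (Torgerson) MDS: embed a distance matrix in `dim` dimensions."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    B = -0.5 * J @ (D ** 2) @ J          # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:dim]  # largest eigenvalues first
    return vecs[:, order] * np.sqrt(np.clip(vals[order], 0.0, None))


# Toy example: five data sets drawn from Gaussians with shifting means.
rng = np.random.default_rng(0)
datasets = [rng.normal(loc=m, scale=1.0, size=300) for m in np.linspace(-2, 2, 5)]
grid = np.linspace(-8.0, 8.0, 512)

D = hellinger_matrix(datasets, grid)
Y = classical_mds(D, dim=2)              # one 2-D point per data set
```

Each row of `Y` represents one whole data set rather than one sample, so standard clustering or classification can then be run on the embedded points. A direct Hellinger distance saturates for densities that are far apart, so a more faithful sketch would approximate geodesics by summing short hops over a neighborhood graph before applying MDS; that step is omitted here for brevity.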
