The e-PCA and m-PCA: dimension reduction of parameters by information geometry

We propose a method for extracting a low-dimensional structure from a set of parameters of probability distributions. Through an information-geometric interpretation, we show that there are two kinds of flat structures suitable for fitting (e-PCA and m-PCA). We derive alternating procedures for finding the low-dimensional structures. Each step of the alternating procedure reduces to a nonlinear equation, which can be solved analytically in some special cases; otherwise, we apply gradient-type methods, which we also derive. Since the overall algorithm may converge to a local optimum, we propose a method for finding a good initial solution by using metric information.
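
To make the alternating scheme concrete, here is a minimal sketch of the e-flat case for products of Bernoulli distributions. It is an illustrative assumption, not the paper's actual algorithm: the function name e_pca_bernoulli, the learning rate, and the plain gradient updates on both factors are choices made for brevity (the paper solves the subproblems analytically where possible and otherwise uses gradient-type methods). The e-flat submanifold is represented as an affine subspace in the natural-parameter space, and the total KL divergence from each data distribution to its projection is reduced by alternating steps.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def e_pca_bernoulli(P, K, n_iter=500, lr=0.1, seed=0):
    """Toy e-PCA sketch for products of Bernoulli distributions.

    Each row of P holds the mean parameters of one data distribution.
    We fit an e-flat submanifold, i.e. an affine subspace in the
    natural-parameter space, Theta = a0 + V @ W, and alternate
    gradient steps on the scores V and the basis (a0, W), reducing
    the total divergence sum_i KL(P[i] || sigmoid(Theta[i])).
    """
    rng = np.random.default_rng(seed)
    N, D = P.shape
    V = rng.normal(scale=0.1, size=(N, K))   # per-sample low-dimensional coordinates
    W = rng.normal(scale=0.1, size=(K, D))   # basis spanning the e-flat subspace
    a0 = np.zeros(D)                          # offset (origin of the subspace)
    for _ in range(n_iter):
        Theta = a0 + V @ W
        R = sigmoid(Theta) - P                # gradient of the KL sum w.r.t. Theta
        V -= lr * (R @ W.T) / D               # projection step: move each point's coordinates
        Theta = a0 + V @ W
        R = sigmoid(Theta) - P
        W -= lr * (V.T @ R) / N               # fitting step: move the submanifold itself
        a0 -= lr * R.mean(axis=0)
    return a0, V, W
```

For instance, calling e_pca_bernoulli(P, K=2) on a matrix of mean parameters P (entries strictly in (0,1)) yields two-dimensional coordinates V in natural-parameter space; exchanging the roles of natural and expectation parameters would give the corresponding m-PCA variant.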
