Manifold Alignment Preserving Global Geometry

This paper proposes a novel algorithm for manifold alignment that preserves global geometry. The approach constructs mapping functions that project data instances from different input domains into a new lower-dimensional space, simultaneously matching the instances in correspondence and preserving the global distances between instances within each original domain. In contrast to previous approaches, which are largely based on preserving local geometry, the proposed approach is suited to applications where the global manifold geometry needs to be respected. We evaluate the effectiveness of our algorithm for transfer learning in two real-world cross-lingual information retrieval tasks.
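To make the idea of a shared space that matches corresponding instances while keeping global within-domain distances concrete, here is a minimal sketch. It is not the algorithm proposed in the paper: it swaps the learned mapping functions for a nonparametric joint embedding computed with classical multidimensional scaling, and it assumes Isomap-style graph geodesics as the "global distance". The function names (`geodesic_dist`, `align_global`), the `pairs` array of known correspondences, the neighbourhood size `k`, and the rescaling step are all illustrative assumptions.

```python
import numpy as np


def geodesic_dist(X, k):
    """Approximate geodesic distances on a k-nearest-neighbour graph (assumed metric)."""
    n = X.shape[0]
    # Euclidean distances; the diagonal is exactly zero.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Keep only edges to the k nearest neighbours (column 0 of argsort is the point itself).
    G = np.full((n, n), np.inf)
    nbrs = np.argsort(D, axis=1)[:, 1:k + 1]
    rows = np.repeat(np.arange(n), k)
    G[rows, nbrs.ravel()] = D[rows, nbrs.ravel()]
    G = np.minimum(G, G.T)  # make the graph undirected
    np.fill_diagonal(G, 0.0)
    # Floyd-Warshall all-pairs shortest paths.
    for m in range(n):
        G = np.minimum(G, G[:, [m]] + G[[m], :])
    if not np.isfinite(G).all():
        raise ValueError("neighbourhood graph is disconnected; increase k")
    return G


def align_global(X, Y, pairs, dim, k=5):
    """Embed X and Y jointly so that paired rows coincide (illustrative sketch only).

    pairs is an integer array of shape (c, 2), c >= 2; row (i, j) states that
    X[i] and Y[j] are in correspondence.
    """
    ix, iy = pairs[:, 0], pairs[:, 1]
    Dx = geodesic_dist(X, k)
    Dy = geodesic_dist(Y, k)
    # Assumption: rescale Y so distances among the paired points are comparable.
    Dy = Dy * (Dx[np.ix_(ix, ix)].sum() / Dy[np.ix_(iy, iy)].sum())
    # Cross-domain distance: the shortest route through any correspondence pair,
    # Dxy[i, j] = min_c ( Dx[i, ix[c]] + Dy[iy[c], j] ).
    Dxy = np.min(Dx[:, ix][:, :, None] + Dy[iy, :][None, :, :], axis=1)
    D = np.block([[Dx, Dxy], [Dxy.T, Dy]])
    # Classical MDS on the joint distance matrix.
    n = D.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * H @ (D ** 2) @ H
    w, V = np.linalg.eigh(B)
    top = np.argsort(w)[::-1][:dim]
    Z = V[:, top] * np.sqrt(np.maximum(w[top], 0.0))
    return Z[:len(X)], Z[len(X):]
```

Calling `align_global(X, Y, pairs, dim=2)` returns one coordinate array per domain in the shared space. Because each paired instance is at distance zero from its partner in the joint matrix, the two land at nearly the same coordinates, while the within-domain blocks carry the global distances into the embedding. Unlike mapping functions, this sketch cannot project instances that were not part of the embedding.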
