An Information Retrieval Approach for Finding Dependent Subspaces of Multiple Views

Finding relationships between multiple views of data is essential both for exploratory analysis and as pre-processing for predictive tasks. A prominent approach is to apply variants of Canonical Correlation Analysis (CCA), a classical method that seeks correlated components between views. Basic CCA is restricted to maximizing a simple dependency criterion, correlation, measured directly between data coordinates. We introduce a new method that finds dependent subspaces of views directly optimized for the data analysis task of neighbor retrieval between multiple views. We optimize mappings for each view, such as linear transformations, to maximize cross-view similarity between neighborhoods of data samples. The criterion arises directly from the well-defined retrieval task, detects nonlinear and local similarities, measures dependency of data relationships rather than only individual data coordinates, and is related to well-understood measures of information retrieval quality. In experiments, we show that the proposed method outperforms alternatives in preserving cross-view neighborhood similarities and yields insights into local dependencies between multiple views.
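To make the idea of cross-view neighborhood similarity concrete, the following is a minimal sketch rather than the paper's actual criterion, which the abstract does not spell out. It assumes Gaussian neighbor probabilities in each linearly projected view and compares them with a symmetrized Kullback-Leibler divergence, a choice borrowed from the neighbor-embedding literature. The names `neighbor_probs`, `cross_view_cost`, `sigma`, and `eps` are hypothetical and introduced only for illustration.

```python
import numpy as np


def neighbor_probs(Z, sigma=1.0):
    """Row-stochastic matrix P where P[i, j] is the probability of
    retrieving sample j as a neighbor of sample i in the space Z.
    Assumption: Gaussian falloff with a single shared bandwidth sigma."""
    sq_dist = np.sum((Z[:, None, :] - Z[None, :, :]) ** 2, axis=-1)
    logits = -sq_dist / (2.0 * sigma ** 2)
    np.fill_diagonal(logits, -np.inf)            # a sample is not its own neighbor
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    P = np.exp(logits)
    return P / P.sum(axis=1, keepdims=True)


def cross_view_cost(X1, X2, W1, W2, sigma=1.0, eps=1e-12):
    """Average symmetrized KL divergence between the neighbor
    distributions of two linearly projected views (lower = more similar).
    Assumption: this stands in for the paper's retrieval-based criterion."""
    P = neighbor_probs(X1 @ W1, sigma)
    Q = neighbor_probs(X2 @ W2, sigma)
    off_diag = ~np.eye(P.shape[0], dtype=bool)
    p, q = P[off_diag] + eps, Q[off_diag] + eps
    # KL(P||Q) + KL(Q||P) summed over all pairs, then averaged per sample
    return float(np.sum((p - q) * np.log(p / q)) / P.shape[0])
```

Under these assumptions, the projection matrices `W1` and `W2` would be the parameters to optimize, for example by gradient descent on `cross_view_cost`. Because the cost depends only on neighborhoods within each projected view, the two views may have different original dimensionalities, and the cost can be low even when the coordinates themselves are not linearly correlated.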
