Twin Kernel Embedding

Most existing dimensionality reduction algorithms aim to preserve the relational structure among objects of the input space in a low-dimensional embedding space. They do so by minimizing the inconsistency between two similarity/dissimilarity measures, one for the input data and the other for the embedded data, via a separate matching objective function. Based on this idea, a new dimensionality reduction method called twin kernel embedding (TKE) is proposed. TKE addresses the problem of visualizing non-vectorial data, which is difficult for conventional methods in practice because such data lack an efficient vectorial representation. TKE solves this problem by minimizing the inconsistency between the similarity measures in the two spaces, each captured by its kernel Gram matrix. In the implementation, a nonlinear objective function is optimized with the gradient descent algorithm, which reaches a local minimum. The results include both the locally optimal similarity-preserving embedding and appropriate values for the kernel hyperparameters. Experimental evaluation on real non-vectorial datasets confirmed the effectiveness of TKE. TKE can also be applied to types of data beyond those mentioned in this paper, whenever suitable measures of similarity/dissimilarity can be defined on the input data.
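The idea of matching two kernel Gram matrices by gradient descent can be illustrated with a minimal sketch. It is not the paper's objective function: as a stand-in, it assumes a squared Frobenius mismatch between a given input Gram matrix and an RBF Gram matrix computed on the embedding, keeps the kernel width `gamma` fixed rather than learning it, and uses a simple step-halving rule. The names `rbf_gram`, `mismatch`, `mismatch_grad`, `embed`, and the toy matrix `K_in` are all hypothetical.

```python
import numpy as np

def rbf_gram(X, gamma=1.0):
    """RBF Gram matrix of the embedded points (rows of X)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def mismatch(K_in, X, gamma):
    """Assumed stand-in objective: squared Frobenius gap between Gram matrices."""
    E = rbf_gram(X, gamma) - K_in
    return float(np.sum(E ** 2))

def mismatch_grad(K_in, X, gamma):
    """Analytic gradient of `mismatch` with respect to the embedding X."""
    K_emb = rbf_gram(X, gamma)
    W = (K_emb - K_in) * K_emb  # elementwise product
    return -8.0 * gamma * (W.sum(axis=1)[:, None] * X - W @ X)

def embed(K_in, dim=2, gamma=1.0, steps=200, lr=0.1, seed=0):
    """Gradient descent that accepts a step only if the mismatch decreases."""
    rng = np.random.default_rng(seed)
    X = 0.1 * rng.standard_normal((K_in.shape[0], dim))
    loss = mismatch(K_in, X, gamma)
    for _ in range(steps):
        X_new = X - lr * mismatch_grad(K_in, X, gamma)
        new_loss = mismatch(K_in, X_new, gamma)
        if new_loss < loss:
            X, loss = X_new, new_loss
        else:
            lr *= 0.5  # step too large: halve it and retry
    return X, loss

# Toy input Gram matrix (made-up values): objects 0-1 and 2-3 form two groups.
K_in = np.array([
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.9],
    [0.1, 0.1, 0.9, 1.0],
])
X, final_loss = embed(K_in)
```

Only the Gram matrix `K_in` enters the computation, so the input objects never need a vectorial representation; any kernel defined on strings, graphs, or sets could supply it. Like the procedure described in the abstract, plain gradient descent on this non-convex objective can only be expected to reach a local minimum.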
