Kernel and Dissimilarity Methods for Exploratory Analysis in a Social Context

While most statistical methods for prediction or data mining have been built for data made of independent observations of a common set of p numerical variables, many real-world applications do not fit this framework. A more common and general situation is one where a relevant similarity or dissimilarity can be computed between the observations, providing a summary of their relations to each other. This setting is related to the kernel framework, which has made it possible to extend most standard supervised and unsupervised statistical methods to any type of data for which a relevant kernel can be obtained. The present chapter presents kernel methods in general, with a specific focus on the less studied unsupervised framework. We illustrate their usefulness by describing the extension of self-organizing maps and by proposing an approach to combine kernels efficiently. The overall approach is illustrated on categorical time series in a social-science context, which shows how the choice of a given dissimilarity, or group of dissimilarities, can influence the output of the exploratory analysis.
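As a rough illustration of the link between dissimilarities and kernels mentioned above, the following sketch builds two dissimilarities on toy categorical sequences, turns each into a similarity matrix by double centering, and averages them. Everything here is an assumption made for illustration: the toy sequences, the function names, the choice of a Hamming dissimilarity and of a state-count dissimilarity, and the fixed equal weights. It is not the combination approach proposed in the chapter.

```python
import numpy as np

def hamming_dissimilarity(sequences):
    """Number of positions at which two equal-length sequences differ."""
    seqs = np.array([list(s) for s in sequences])
    return (seqs[:, None, :] != seqs[None, :, :]).sum(axis=2).astype(float)

def state_count_dissimilarity(sequences):
    """Squared Euclidean distance between vectors of time spent in each state."""
    states = sorted(set("".join(sequences)))
    counts = np.array([[s.count(a) for a in states] for s in sequences], dtype=float)
    return ((counts[:, None, :] - counts[None, :, :]) ** 2).sum(axis=2)

def double_center(dissimilarity):
    """Similarity matrix K = -1/2 * J D J, with J = I - (1/n) 1 1^T."""
    n = dissimilarity.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    return -0.5 * J @ dissimilarity @ J

# Hypothetical toy trajectories: E = employed, U = unemployed, S = student.
sequences = ["EEUUE", "EEEUU", "UUUUE", "SSEEE"]

D_hamming = hamming_dissimilarity(sequences)
D_counts = state_count_dissimilarity(sequences)

K_hamming = double_center(D_hamming)
K_counts = double_center(D_counts)

# Fixed equal weights, chosen only for illustration; a convex combination
# of positive semi-definite matrices is itself positive semi-definite.
K_combined = 0.5 * K_hamming + 0.5 * K_counts

print(np.round(K_combined, 3))
print("smallest eigenvalue:", np.linalg.eigvalsh(K_combined).min())
```

Both dissimilarities used here are squared Euclidean distances up to a constant factor, so their double-centered matrices are positive semi-definite and can be used as kernels. For an arbitrary dissimilarity this need not hold, and the resulting matrix may have negative eigenvalues.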
