A measure of association between vectors based on "similarity covariance"

The "maximum similarity correlation" definition introduced in this study is motivated by the seminal work of Szekely et al. on "distance covariance" (Ann. Statist. 2007, 35: 2769-2794; Ann. Appl. Stat. 2009, 3: 1236-1265). Instead of using Euclidean distances "d" as in Szekely et al., we use "similarity", which can be defined as "exp(-d/s)", where the scaling parameter s>0 controls how rapidly the similarity falls off with distance. The scale parameters are chosen by maximizing the similarity correlation. The motivation for using "similarity" originates in spectral clustering theory (see e.g. Ng et al. 2001, Advances in Neural Information Processing Systems 14: 849-856). We show that a particular form of similarity correlation is asymptotically equivalent to distance correlation for large values of the scale parameter. Furthermore, we extend similarity correlation to coherence between complex-valued vectors, including its partitioning into real and imaginary contributions. Several toy examples are used to compare distance and similarity correlations. For instance, points on a noiseless straight line give distance and similarity correlation values equal to 1, but points on a noiseless circle produce near-zero distance correlation (dCorr=0.02) while the similarity correlation is distinctly nonzero (sCorr=0.36). In contrast to the distance approach, similarity gives more importance to small distances, which emphasizes the local properties of functional relations. This paper represents a preliminary empirical study, showing that the novel similarity association has some distinct practical advantages over distance-based association. For the sake of reproducible research, the software code implementing all methods here (using Lazarus Free Pascal, "www.lazarus.freepascal.org"), including all test data, is freely available at: "sites.google.com/site/pascualmarqui/home/similaritycovariance".
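The construction described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' Free Pascal implementation: it assumes that similarity correlation is computed like sample distance correlation, with the double-centered distance matrices replaced by double-centered similarity matrices "exp(-d/s)", and that the maximization over scales is done by a coarse grid search over multiples of the median pairwise distance. The function names, the grid, and the square-root normalization are all assumptions made for illustration, so the values it produces need not match those reported in the paper.

```python
import math
import statistics


def distance_matrix(points):
    """Pairwise Euclidean distances for a list of equal-length tuples."""
    return [[math.dist(p, q) for q in points] for p in points]


def similarity_matrix(d, scale):
    """Similarity exp(-d/s) applied elementwise to a distance matrix."""
    return [[math.exp(-dij / scale) for dij in row] for row in d]


def double_center(m):
    """Subtract row and column means and add back the grand mean."""
    n = len(m)
    row = [sum(r) / n for r in m]
    col = [sum(m[i][j] for i in range(n)) / n for j in range(n)]
    grand = sum(row) / n
    return [[m[i][j] - row[i] - col[j] + grand for j in range(n)]
            for i in range(n)]


def matrix_correlation(a, b):
    """Square root of the normalized inner product of the double-centered
    matrices (assumed normalization, as in sample distance correlation)."""
    ac, bc = double_center(a), double_center(b)

    def dot(u, v):
        return sum(ui * vi for ru, rv in zip(u, v) for ui, vi in zip(ru, rv))

    denom = math.sqrt(dot(ac, ac) * dot(bc, bc))
    if denom == 0.0:
        return 0.0
    return math.sqrt(max(dot(ac, bc) / denom, 0.0))


def distance_correlation(x, y):
    """Sample distance correlation between paired observations x and y."""
    return matrix_correlation(distance_matrix(x), distance_matrix(y))


def similarity_correlation(x, y, sx, sy):
    """Similarity correlation for fixed scale parameters sx, sy > 0."""
    return matrix_correlation(similarity_matrix(distance_matrix(x), sx),
                              similarity_matrix(distance_matrix(y), sy))


def median_distance(d):
    """Median off-diagonal distance, used only to set a data-driven grid."""
    n = len(d)
    med = statistics.median(d[i][j] for i in range(n) for j in range(i + 1, n))
    return med if med > 0 else 1.0


def max_similarity_correlation(x, y, multipliers=(0.25, 0.5, 1.0, 2.0, 4.0, 8.0)):
    """Maximize over a hypothetical grid of scales (multiples of the median)."""
    dx, dy = distance_matrix(x), distance_matrix(y)
    mx, my = median_distance(dx), median_distance(dy)
    return max(
        matrix_correlation(similarity_matrix(dx, kx * mx),
                           similarity_matrix(dy, ky * my))
        for kx in multipliers for ky in multipliers
    )


if __name__ == "__main__":
    n = 40
    angles = [2.0 * math.pi * k / n for k in range(n)]
    circle_x = [(math.cos(t),) for t in angles]
    circle_y = [(math.sin(t),) for t in angles]
    print("circle dCorr:", distance_correlation(circle_x, circle_y))
    print("circle sCorr:", max_similarity_correlation(circle_x, circle_y))
```

Under these assumptions, a very large scale makes "exp(-d/s)" approximately "1 - d/s", and double-centering removes the constant, which is why the similarity version approaches the distance version in that limit. A finer grid or a continuous optimizer could replace the grid search.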

[1]  Zhou Zhou.  Measuring nonlinear dependence in time‐series, a distance correlation approach , 2012 .

[2]  Mehryar Mohri,et al.  Algorithms for Learning Kernels Based on Centered Alignment , 2012, J. Mach. Learn. Res..

[3]  Rolando J. Biscay-Lirio,et al.  Assessing interactions in the brain with exact low-resolution electromagnetic tomography , 2011, Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences.

[4]  Maria L. Rizzo,et al.  Brownian distance covariance , 2009, 1010.0297.

[5]  Maria L. Rizzo,et al.  Rejoinder: Brownian distance covariance , 2009, 1010.0844.

[6]  Maria L. Rizzo,et al.  Measuring and testing dependence by correlation of distances , 2007, 0803.4101.

[7]  V. Yohai,et al.  Robust Statistics: Theory and Methods , 2006 .

[8]  Bernhard Schölkopf,et al.  Measuring Statistical Dependence with Hilbert-Schmidt Norms , 2005, ALT.

[9]  R D Pascual-Marqui,et al.  Standardized low-resolution brain electromagnetic tomography (sLORETA): technical details. , 2002, Methods and findings in experimental and clinical pharmacology.

[10]  Michael I. Jordan,et al.  On Spectral Clustering: Analysis and an algorithm , 2001, NIPS.

[11]  N. Cristianini,et al.  On Kernel-Target Alignment , 2001, NIPS.

[12]  D. Brillinger Time series - data analysis and theory , 1981, Classics in applied mathematics.

[13]  D. Lehmann,et al.  Low resolution electromagnetic tomography: a new method for localizing electrical activity in the brain. , 1994, International journal of psychophysiology : official journal of the International Organization of Psychophysiology.

[14]  Y. Escoufier LE TRAITEMENT DES VARIABLES VECTORIELLES , 1973 .