Correlation Analysis in Contaminated Data by Singular Spectrum Analysis

Correlation analysis is one of the standard and most informative descriptive statistical tools for studying relationships between variables in bivariate and multivariate data. However, when the data are contaminated with outlying observations, the standard Pearson correlation can be misleading and lead to erroneous conclusions. In this paper, we propose three new approaches to estimating linear correlation based on singular spectrum analysis, a nonparametric method designed for analysing time series data. In these proposals, the correlation is computed after the noise has been removed from the data using singular spectrum analysis based methods. The usefulness of our proposals for contaminated data is assessed by Monte Carlo simulation under different contamination schemes, and through applications to real data from the aluminium industry and to synthetic sparse data. In addition, the proposals are compared with robust hybrid filtering methods. Copyright © 2016 John Wiley & Sons, Ltd.
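
The general idea of computing a correlation after removing noise can be illustrated with a minimal sketch. It is not one of the three proposals, whose details the abstract does not give; it assumes basic singular spectrum analysis (embedding, truncated SVD, diagonal averaging) applied to each series separately, followed by the ordinary Pearson correlation. The function names `ssa_reconstruct` and `ssa_correlation`, and the `window` and `rank` parameters, are hypothetical choices made for illustration.

```python
import numpy as np


def ssa_reconstruct(x, window, rank):
    """Denoise a 1-D series with basic SSA (illustrative sketch only)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    k = n - window + 1
    # Embedding: build the (window x k) trajectory matrix of lagged vectors.
    traj = np.column_stack([x[i:i + window] for i in range(k)])
    # Decomposition: keep only the leading `rank` singular triples.
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    approx = (u[:, :rank] * s[:rank]) @ vt[:rank]
    # Diagonal averaging: map the low-rank matrix back to a series.
    recon = np.zeros(n)
    counts = np.zeros(n)
    for j in range(k):
        recon[j:j + window] += approx[:, j]
        counts[j:j + window] += 1
    return recon / counts


def ssa_correlation(x, y, window, rank):
    """Pearson correlation of two series after SSA denoising (assumed scheme)."""
    xs = ssa_reconstruct(x, window, rank)
    ys = ssa_reconstruct(y, window, rank)
    return np.corrcoef(xs, ys)[0, 1]


if __name__ == "__main__":
    # Synthetic example: two related signals, one contaminated with outliers.
    rng = np.random.default_rng(0)
    t = np.linspace(0.0, 4.0 * np.pi, 200)
    x = np.sin(t) + 0.1 * rng.standard_normal(t.size)
    y = 2.0 * np.sin(t) + 0.1 * rng.standard_normal(t.size)
    y[::25] += 8.0  # inject a few outlying observations
    print("Pearson on raw data:", np.corrcoef(x, y)[0, 1])
    print("Pearson after SSA:  ", ssa_correlation(x, y, window=40, rank=2))
```

The window length and the number of retained components drive how much of the series is treated as noise, so both would need to be chosen with care in any real application.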
