Wasserstein Stationary Subspace Analysis

Learning under nonstationarity can be achieved by decomposing the data into a stationary and a nonstationary subspace [stationary subspace analysis (SSA)]. While SSA has been applied in a variety of settings, its robustness and computational efficiency are limited by the difficulty of optimizing its Kullback–Leibler-divergence-based objective. In this paper, we extend SSA in two ways: we propose SSA variants with 1) higher numerical efficiency, by deriving analytical solutions, and 2) higher robustness, by employing the Wasserstein-2 distance (Wasserstein SSA). We demonstrate the usefulness of our novel algorithms on toy data, illustrating their mathematical properties, and on real-world data, where they 1) allow better segmentation of time series and 2) improve brain–computer interfacing, in which the Wasserstein-based measure of nonstationarity is used to regularize spatial filters and yields higher decoding performance.
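
To make the Wasserstein-2-based notion of nonstationarity concrete, the sketch below scores a candidate projection by the closed-form Bures–Wasserstein distance between projected epoch covariances and their average. This is an illustrative assumption rather than the paper's actual optimization routine; the function names bures_wasserstein2 and nonstationarity_score are hypothetical.

    # Minimal sketch (not the authors' implementation): closed-form
    # Wasserstein-2 (Bures) distance between zero-mean Gaussians,
    # W2^2 = tr(S1 + S2 - 2 (S1^{1/2} S2 S1^{1/2})^{1/2}),
    # used as a nonstationarity score for a candidate projection B.
    import numpy as np
    from scipy.linalg import sqrtm

    def bures_wasserstein2(S1, S2):
        """Squared Wasserstein-2 distance between N(0, S1) and N(0, S2)."""
        root = sqrtm(S1)
        cross = sqrtm(root @ S2 @ root)
        return np.trace(S1 + S2 - 2 * np.real(cross))

    def nonstationarity_score(B, epoch_covs):
        """Sum of W2^2 distances between each projected epoch covariance
        and their average; smaller values mean a more stationary subspace.
        B has shape (d, k): k candidate stationary directions in d channels."""
        projected = [B.T @ C @ B for C in epoch_covs]
        mean_cov = sum(projected) / len(projected)
        return sum(bures_wasserstein2(C, mean_cov) for C in projected)

    # Toy usage: two epochs whose covariances differ only outside the
    # subspace spanned by the first coordinate.
    C1 = np.diag([1.0, 1.0])
    C2 = np.diag([1.0, 4.0])
    B_stat = np.array([[1.0], [0.0]])     # stationary direction
    B_nonstat = np.array([[0.0], [1.0]])  # nonstationary direction
    print(nonstationarity_score(B_stat, [C1, C2]))     # ~0
    print(nonstationarity_score(B_nonstat, [C1, C2]))  # > 0

A full SSA-style method would then search over projections B to minimize such a score; here the score is only shown as a nonstationarity measure.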
