Applying dimension reduction to EEG data by Principal Component Analysis reduces the quality of its subsequent Independent Component decomposition

Independent Component Analysis (ICA) has proven to be an effective data-driven method for analyzing EEG data, separating signals from temporally and functionally independent brain and non‐brain source processes and thereby increasing their definition. Dimension reduction by Principal Component Analysis (PCA) has often been recommended before ICA decomposition of EEG data, to minimize both the amount of data required and the computation time. Here we compared ICA decompositions of fourteen 72‐channel single subject EEG data sets obtained (i) after applying preliminary dimension reduction by PCA, (ii) after applying no such dimension reduction, or (iii) after applying PCA only. Reducing the data rank by PCA (even to remove only 1% of data variance) adversely affected both the numbers of dipolar independent components (ICs) and their stability under repeated decomposition. For example, decomposing a principal subspace retaining 95% of original data variance reduced the mean number of recovered ‘dipolar’ ICs from 30 to 10 per data set and reduced median IC stability from 90% to 76%. PCA rank reduction also decreased the numbers of near‐equivalent ICs across subjects. For instance, decomposing a principal subspace retaining 95% of data variance reduced the number of subjects represented in an IC cluster accounting for frontal midline theta activity from 11 to 5. PCA rank reduction also increased uncertainty in the equivalent dipole positions and spectra of the IC brain effective sources. These results suggest that when applying ICA decomposition to EEG data, PCA rank reduction is best avoided.
Highlights
It is currently a common practice to apply dimension reduction to EEG data using PCA before performing ICA decomposition.
We tested the quality of Independent Components (ICs) after different levels of rank reduction to a principal subspace.
PCA rank reduction adversely affected dipolarity and stability of ICs accounting for brain and known non‐brain processes.
PCA rank reduction also increased inter‐subject variance in IC source locations (by equivalent dipole fitting) and spectra.
For EEG data at least, PCA rank reduction should be avoided or carefully tested before applying it as a preprocessing step.
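
As an illustration of what "decomposing a principal subspace retaining 95% of data variance" involves, the following is a minimal sketch of the rank-reduction step only. It is not the authors' pipeline: the function name `pca_reduce`, its parameters, and the synthetic 72-channel data are assumptions made for this example, and the ICA step that would follow is omitted.

```python
import numpy as np


def pca_reduce(data, variance_to_keep=0.95):
    """Project channels-by-samples data onto the smallest principal
    subspace that retains at least `variance_to_keep` of total variance.

    Hypothetical helper for illustration; returns the reduced data,
    the orthonormal basis (channels x rank), and the retained rank.
    """
    # Remove each channel's mean so that variance is measured about zero.
    centered = data - data.mean(axis=1, keepdims=True)
    # Columns of u are principal directions; s**2 is proportional to
    # the variance carried by each direction.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    explained = np.cumsum(s**2) / np.sum(s**2)
    # First index whose cumulative variance reaches the target, plus one.
    rank = min(int(np.searchsorted(explained, variance_to_keep)) + 1, len(s))
    basis = u[:, :rank]
    return basis.T @ centered, basis, rank


# Synthetic stand-in for a 72-channel recording (assumed, not real EEG):
# super-Gaussian sources with decaying amplitudes, linearly mixed.
rng = np.random.default_rng(0)
n_channels, n_samples = 72, 5000
amplitudes = np.logspace(0, -2, n_channels)[:, None]
sources = rng.laplace(size=(n_channels, n_samples)) * amplitudes
mixing = rng.standard_normal((n_channels, n_channels))
eeg_like = mixing @ sources

reduced, basis, rank = pca_reduce(eeg_like, variance_to_keep=0.95)
print(f"kept {rank} of {n_channels} dimensions")
```

Any ICA run on `reduced` can return at most `rank` components, so sources whose projections fall largely outside the retained subspace cannot be separated cleanly. That is one plausible reading of why the reported IC counts and stability drop after rank reduction.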
