Population Value Decomposition, a Framework for the Analysis of Image Populations

Images, often stored in multidimensional arrays, are fast becoming ubiquitous in medical and public health research. Analyzing populations of images is a statistical problem that raises a host of daunting challenges. The most significant challenge is the massive size of the datasets, which include images recorded for hundreds or thousands of subjects at multiple visits. We introduce the population value decomposition (PVD), a general method for simultaneous dimensionality reduction of large populations of massive images. We show how PVD can be seamlessly incorporated into statistical modeling, leading to a new, transparent, and rapid inferential framework. Our PVD methodology was motivated by and applied to the Sleep Heart Health Study, the largest community-based cohort study of sleep, which contains more than 85 billion observations on thousands of subjects at two visits. This article has supplementary material online.
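
The abstract does not spell out the algorithm, so the following is only a minimal sketch of one way a simultaneous reduction of this kind could be computed. It assumes each image Y_i is approximated as P V_i D, with P and D shared across the population and V_i a small subject-specific matrix, and that P and D are built from subject-level singular vectors. The function name `pvd` and the parameters `n_subject`, `n_left`, and `n_right` are hypothetical, chosen for illustration only.

```python
import numpy as np

def pvd(images, n_subject=3, n_left=3, n_right=3):
    """Sketch of a two-stage decomposition Y_i ~ P @ V_i @ D (assumed form)."""
    left_blocks, right_blocks = [], []
    for y in images:
        # Stage 1: subject-level SVD, keeping the leading singular vectors.
        u, _, vt = np.linalg.svd(y, full_matrices=False)
        left_blocks.append(u[:, :n_subject])
        right_blocks.append(vt[:n_subject, :].T)
    stacked_left = np.hstack(left_blocks)    # rows x (subjects * n_subject)
    stacked_right = np.hstack(right_blocks)  # cols x (subjects * n_subject)
    # Stage 2: population-level bases from the stacked singular vectors.
    P = np.linalg.svd(stacked_left, full_matrices=False)[0][:, :n_left]
    D = np.linalg.svd(stacked_right, full_matrices=False)[0][:, :n_right].T
    # Subject-specific cores: project each image onto the shared bases.
    cores = [P.T @ y @ D.T for y in images]
    return P, cores, D

# Synthetic population (illustrative only): five 20 x 30 images that share
# exact rank-3 row and column structure.
rng = np.random.default_rng(0)
P0 = rng.standard_normal((20, 3))
D0 = rng.standard_normal((3, 30))
images = [P0 @ rng.standard_normal((3, 3)) @ D0 for _ in range(5)]

P, cores, D = pvd(images)
recon = [P @ v @ D for v in cores]
rel_err = max(np.linalg.norm(y - r) / np.linalg.norm(y)
              for y, r in zip(images, recon))
```

In this sketch each 20 x 30 image is summarized by a 3 x 3 matrix in `cores`, and downstream statistical models would operate on those small matrices rather than on the full images. Real data would not be exactly low rank, so `rel_err` would be nonzero and the three rank parameters would need to be chosen by the analyst.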
