Inferring Disease Status by Non-parametric Probabilistic Embedding

Computing similarity between all pairs of patients in a dataset enables us to group subjects into disease subtypes and infer their disease status. However, robust and efficient computation of pairwise similarity is challenging for large-scale medical image datasets. We specifically target diseases in which multiple subtypes of pathology are present simultaneously, which makes similarity difficult to define. To define pairwise patient similarity, we characterize each subject by a probability distribution that generates its local image descriptors. We adopt a notion of affinity between probability distributions that lends itself to a measure of similarity between subjects. Instead of approximating the distributions by a parametric family, we propose to compute the affinity measure indirectly using an approximate nearest neighbor estimator. The pairwise similarities enable us to embed the entire patient population into a lower-dimensional manifold, mapping each subject from the high-dimensional image space to an informative low-dimensional representation. We validate our method on a large-scale lung CT scan study and demonstrate state-of-the-art prediction of an important physiologic measure of airflow (the forced expiratory volume in one second, FEV1) as well as of a 5-category clinical rating (the so-called GOLD score).
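
As a rough illustration of the idea, the sketch below estimates a divergence between two subjects directly from their descriptor samples, without fitting a parametric density. It is not the authors' implementation: it assumes a KL-type divergence estimated from k-nearest-neighbor distances, uses an exact brute-force neighbor search in place of the approximate search mentioned above, and turns the symmetrized divergence into a similarity with an exponential kernel. The function names, the `scale` parameter, and the synthetic Gaussian "descriptors" are all hypothetical.

```python
import math
import random


def kth_nn_distance(point, samples, k):
    """Distance from `point` to its k-th nearest neighbor in `samples`.

    Exact brute-force search; a large-scale setting would presumably
    replace this with an approximate nearest-neighbor index.
    """
    return sorted(math.dist(point, s) for s in samples)[k - 1]


def knn_kl_divergence(x, y, k=1):
    """k-NN estimate of KL(P || Q) from samples x ~ P and y ~ Q.

    x and y are lists of equal-length tuples (descriptors).
    Assumes no duplicate descriptors, so all distances are positive.
    """
    n, m, d = len(x), len(y), len(x[0])
    total = 0.0
    for i, xi in enumerate(x):
        rho = kth_nn_distance(xi, x[:i] + x[i + 1:], k)  # within x, excluding xi
        nu = kth_nn_distance(xi, y, k)                    # within y
        total += math.log(nu / rho)
    return d * total / n + math.log(m / (n - 1))


def subject_similarity(x, y, k=1, scale=1.0):
    """Hypothetical similarity: exponential kernel on the symmetrized divergence."""
    sym = knn_kl_divergence(x, y, k) + knn_kl_divergence(y, x, k)
    # The estimate can dip slightly below zero for near-identical samples.
    return math.exp(-max(sym, 0.0) / scale)


def sample_subject(rng, mean, n=200, dim=2):
    """Synthetic stand-in for one subject's local image descriptors."""
    return [tuple(rng.gauss(mean, 1.0) for _ in range(dim)) for _ in range(n)]


rng = random.Random(0)
subject_a = sample_subject(rng, mean=0.0)
subject_b = sample_subject(rng, mean=0.0)  # same distribution as subject_a
subject_c = sample_subject(rng, mean=3.0)  # shifted distribution

sim_ab = subject_similarity(subject_a, subject_b)
sim_ac = subject_similarity(subject_a, subject_c)
print(f"similarity(a, b) = {sim_ab:.3f}")
print(f"similarity(a, c) = {sim_ac:.3g}")
```

Applying `subject_similarity` to every pair of subjects would give a similarity matrix that an embedding method could then map to low-dimensional coordinates; which embedding is used, and how the kernel is scaled, are left open here.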
