Robust analysis of MRS brain tumour data using t-GTM

This paper proposes a principled, self-organized framework to manage two sources of uncertainty that are inherent in intelligent systems for medical decision support: outliers and missing data. The framework is applied to magnetic resonance spectra (MRS), which are indicators of the grade of malignancy in brain tumours. A model for multivariate data clustering and visualization, the generative topographic mapping (GTM), is reformulated as a mixture of Student's t-distributions, which makes it more robust to outliers while supporting the imputation of missing values. An important new development is the extension of the model to provide automatic feature relevance determination. Its effectiveness on the MRS data is demonstrated empirically.
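To illustrate why a Student's t formulation is more robust to outliers, the following minimal sketch estimates the mean of a single spherical t-distributed component using the standard EM weight u_n = (nu + D) / (nu + d_n^2), where d_n^2 is the scaled squared distance of point n from the mean. It is not the t-GTM model itself: the function names, the fixed variance, the fixed degrees of freedom `nu`, and the toy data are all assumptions made for illustration.

```python
def t_weights(points, mean, variance, nu):
    """E-step weights u_n = (nu + D) / (nu + scaled squared distance).

    Points far from the mean receive small weights, so they contribute
    little to the next mean update.
    """
    dim = len(mean)
    weights = []
    for x in points:
        sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mean)) / variance
        weights.append((nu + dim) / (nu + sq_dist))
    return weights


def robust_mean(points, variance=1.0, nu=3.0, n_iter=50):
    """Iterate E-step weights and weighted-mean M-step.

    Variance and nu are held fixed here (a simplifying assumption).
    """
    dim = len(points[0])
    # Start from the ordinary (non-robust) sample mean.
    mean = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    for _ in range(n_iter):
        w = t_weights(points, mean, variance, nu)
        total = sum(w)
        mean = [sum(wi * p[d] for wi, p in zip(w, points)) / total
                for d in range(dim)]
    return mean, t_weights(points, mean, variance, nu)


# Toy data (hypothetical): four points near the origin and one outlier.
data = [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0], [0.1, 0.2], [8.0, 8.0]]
plain_mean = [sum(p[d] for p in data) / len(data) for d in range(2)]
mean, weights = robust_mean(data)
print("plain mean:", plain_mean)
print("robust mean:", mean)
print("weights:", weights)
```

In this sketch the outlier ends up with the smallest weight, so the estimated mean stays close to the bulk of the data, whereas the plain sample mean is pulled towards the outlier. A Gaussian component corresponds to the limit of large `nu`, where all weights approach one.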
