Adaptive Dimensionality Reduction with Semi-Supervision (AdDReSS): Classifying Multi-Attribute Biomedical Data

Medical diagnostics is often a multi-attribute problem, necessitating sophisticated tools for analyzing high-dimensional biomedical data. Mining these data often runs into two crucial bottlenecks: 1) the high dimensionality of the features used to represent rich biological data and 2) the small amount of labeled training data, owing to the expense of consulting the highly specialized medical expertise needed to assess each study. To our knowledge, no existing approach has used active learning (AL) within dimensionality reduction to improve the construction of low-dimensional representations. We present a novel methodology, AdDReSS (Adaptive Dimensionality Reduction with Semi-Supervision), and use it to demonstrate that labeled instances identified via AL in the embedding space yield a more discriminative embedding with fewer labels than randomly selected instances do. We tested our methodology on a wide variety of domains: prostate gene expression, ovarian proteomic spectra, brain magnetic resonance imaging, and breast histopathology. Across these high-dimensional biomedical datasets, each with 100+ observations, and across all parameters considered, the median classification accuracy over all experiments was higher for AdDReSS (88.7%) than for SSAGE, a semi-supervised dimensionality reduction (SSDR) method using random sampling (85.5%), and for Graph Embedding (81.5%). Furthermore, embeddings generated via AdDReSS achieved a mean 35.95% improvement over SSAGE in Raghavan efficiency, a measure of learning rate. Our results demonstrate the value of AdDReSS for producing low-dimensional representations of high-dimensional biomedical data while achieving higher classification rates with fewer labeled examples than is possible without active learning.
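The contrast between AL-based and random instance selection in an embedding space can be illustrated with a minimal sketch. This is not the AdDReSS query strategy, which the abstract does not specify; it assumes a simple stand-in criterion in which the most ambiguous unlabeled point is the one whose distances to the two labeled class centroids are most nearly equal. The function names (`most_ambiguous`, `random_query`) and the toy 2-D embedding are hypothetical and chosen only for illustration.

```python
import math
import random


def centroid(points):
    """Mean of a list of equal-length coordinate tuples."""
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dim))


def most_ambiguous(embedding, labels):
    """Return the index of the unlabeled point nearest the class boundary.

    embedding: list of coordinate tuples (a low-dimensional representation).
    labels: dict mapping index -> class (0 or 1) for the labeled points.
    Ambiguity is the absolute difference between the distances to the two
    class centroids (an assumed criterion); smaller means more ambiguous.
    """
    c0 = centroid([embedding[i] for i, y in labels.items() if y == 0])
    c1 = centroid([embedding[i] for i, y in labels.items() if y == 1])
    unlabeled = [i for i in range(len(embedding)) if i not in labels]
    return min(
        unlabeled,
        key=lambda i: abs(math.dist(embedding[i], c0) - math.dist(embedding[i], c1)),
    )


def random_query(embedding, labels, rng):
    """Baseline: pick any unlabeled index uniformly at random."""
    unlabeled = [i for i in range(len(embedding)) if i not in labels]
    return rng.choice(unlabeled)


# Toy 2-D embedding: points 0 and 1 are labeled, points 2-4 are not.
embedding = [(0.0, 0.0), (4.0, 0.0), (0.5, 0.2), (2.1, 0.0), (3.8, -0.1)]
labels = {0: 0, 1: 1}

query = most_ambiguous(embedding, labels)            # point between the classes
baseline = random_query(embedding, labels, random.Random(0))
```

In this toy case the ambiguity criterion selects the point lying between the two labeled classes, whereas the random baseline may spend a label on a point that is already close to one class. In a full semi-supervised loop, the newly labeled instance would then be used to recompute the embedding before the next query.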
