One-class classification for highly imbalanced medical image data

Computer-aided diagnosis plays an important role in clinical image diagnosis. Current clinical image classification tasks usually focus on binary classification, which requires collecting samples of both the positive and negative classes to train a binary classifier. However, in many clinical scenarios, one class has many more samples than the other, which results in data imbalance. Data imbalance is a severe problem that can substantially degrade the performance of binary machine learning models. To address this issue, one-class classification, which focuses on learning features from the samples of a single given class, has been proposed. In this work, we assess the one-class support vector machine (OCSVM) on classification tasks for two highly imbalanced datasets: space-occupying kidney lesion data (including renal cell carcinoma and benign lesions) and breast cancer distant metastasis/non-metastasis imaging data. Experimental results show that the OCSVM exhibits promising performance compared to binary-class and other one-class classification methods.
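To make the one-class setting concrete, the following is a minimal sketch, not the pipeline used in this work. It assumes scikit-learn's `OneClassSVM` and uses synthetic Gaussian vectors as a stand-in for image features. The feature dimension, class sizes, and the hyperparameters `nu` and `gamma` are illustrative assumptions. The model is fitted on samples of the majority class only and then flags test samples that fall outside the learned region.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Synthetic stand-in for extracted image features (hypothetical data,
# not the kidney or breast datasets described in the abstract).
majority_train = rng.normal(0.0, 1.0, size=(500, 8))  # the single class used for training
majority_test = rng.normal(0.0, 1.0, size=(100, 8))   # held-out samples of the same class
minority_test = rng.normal(4.0, 1.0, size=(10, 8))    # rare class, never seen in training

# Standardise features using statistics of the training class only.
scaler = StandardScaler().fit(majority_train)

# nu upper-bounds the fraction of training samples treated as outliers;
# 0.05 is an assumed value, not one reported by the authors.
model = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05)
model.fit(scaler.transform(majority_train))

# predict() returns +1 for inliers (training class) and -1 for outliers.
pred_majority = model.predict(scaler.transform(majority_test))
pred_minority = model.predict(scaler.transform(minority_test))

majority_recall = float(np.mean(pred_majority == 1))
minority_recall = float(np.mean(pred_minority == -1))
print(f"majority accepted: {majority_recall:.2f}, minority flagged: {minority_recall:.2f}")
```

Because no minority samples are needed for fitting, the class ratio affects only evaluation, which is the property that makes this family of methods attractive for highly imbalanced data.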
