Training classifiers with limited data using the Radon cumulative distribution transform

This study investigates whether a recently introduced image transform, the Radon cumulative distribution transform (R-CDT), is a viable preprocessing step for making the training of end-to-end systems more robust when fewer training samples are available. To assess the R-CDT for this purpose, we used a standard machine learning dataset, MNIST, and a preliminary dataset composed of liver cell nuclei images derived from one of two tissue types: benign or malignant tumor lesions. We separated the data into training and testing sets, with 20% of the total data held out for testing across all training set size conditions. To simulate a range of limited training set sizes, we randomly generated training subsets ranging from 80% to 0.8% of the total dataset size. Linear classification was implemented via logistic regression and a support vector machine with a linear kernel, applied both to the raw images and to images transformed via the R-CDT. Additionally, non-linear classification accuracy was assessed by comparing the R-CDT paired with a shallow CNN against a deep CNN used to classify the images. Results indicate that classification in R-CDT space outperforms classification in image space under conditions of limited data, as one is likely to encounter in medical imaging.
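The data-splitting protocol described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `make_training_subsets`, the intermediate fractions, the fixed seed, and the total of 70,000 samples (the size of the full MNIST collection) are all hypothetical choices, and the abstract does not say whether subsets were stratified by class or nested within one another.

```python
import random


def make_training_subsets(n_total, train_fractions, test_fraction=0.2, seed=0):
    """Return (test_idx, {fraction: train_idx}) for one random split.

    Fractions are relative to the TOTAL dataset size, so the largest
    usable training fraction is 1 - test_fraction.
    """
    rng = random.Random(seed)
    indices = list(range(n_total))
    rng.shuffle(indices)

    # The test set is drawn once and reused for every training-size condition.
    n_test = round(n_total * test_fraction)
    test_idx = indices[:n_test]
    train_pool = indices[n_test:]

    subsets = {}
    for frac in train_fractions:
        n_train = round(n_total * frac)
        if not 0 < n_train <= len(train_pool):
            raise ValueError(f"fraction {frac} does not fit in the training pool")
        # Each subset is an independent random draw from the training pool
        # (an assumption; the source only says subsets were randomly generated).
        subsets[frac] = rng.sample(train_pool, n_train)
    return test_idx, subsets


# Only the endpoints (80% and 0.8%) come from the abstract; the values in
# between are placeholders for illustration.
fractions = [0.8, 0.4, 0.08, 0.008]
test_idx, subsets = make_training_subsets(70_000, fractions)

for frac, idx in subsets.items():
    print(f"{frac:.1%} of total -> {len(idx)} training samples")
```

Each index list would then select the rows used to fit a classifier, once on raw pixels and once on R-CDT features, with accuracy always measured on the same held-out `test_idx`.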

Gustavo K. Rohde | Liam Cattell | Cailey E. Fitzgerald

[1] Gustavo K. Rohde, et al. The Radon Cumulative Distribution Transform and Its Application to Image Classification, 2015, IEEE Transactions on Image Processing.

[2] Timothy Doster, et al. De-multiplexing vortex modes in optical communications using transport-based pattern recognition, 2018, Optics Express.

[3] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.

[4] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.

[5] Gaël Varoquaux, et al. Scikit-learn: Machine Learning in Python, 2011, J. Mach. Learn. Res.

[6] Chloe Hutton, et al. Classification of amyloid status using machine learning with histograms of oriented 3D gradients, 2016, NeuroImage: Clinical.

[7] Gustavo K. Rohde, et al. The Cumulative Distribution Transform and Linear Pattern Classification, 2015, Applied and Computational Harmonic Analysis.

[8] Gustavo K. Rohde, et al. A graph-based method for detecting characteristic phenotypes from biomedical images, 2010, 2010 IEEE International Symposium on Biomedical Imaging: From Nano to Macro.