Accelerated training of bootstrap aggregation-based deep information extraction systems from cancer pathology reports

OBJECTIVE In machine learning, it is well established that classification performance improves when bootstrap aggregation (bagging) is applied. However, bagging deep neural networks requires tremendous computational resources and training time. The research question we aimed to answer is whether we could achieve higher task performance scores and accelerate training by dividing a problem into sub-problems.

MATERIALS AND METHODS The data used in this study consist of free text from electronic cancer pathology reports. We applied bagging and partitioned data training using Multi-Task Convolutional Neural Network (MT-CNN) and Multi-Task Hierarchical Convolutional Attention Network (MT-HCAN) classifiers. We split a big problem into 20 sub-problems, resampled the training cases 2,000 times, and trained a deep learning model for each bootstrap sample and each sub-problem, thus generating up to 40,000 models. We trained many models concurrently in a high-performance computing environment at Oak Ridge National Laboratory (ORNL).

RESULTS We demonstrated that aggregating the models improves task performance compared with the single-model approach, which is consistent with other research studies, and that the two proposed partitioned bagging methods achieved higher classification accuracy scores on four tasks. Notably, the improvements were significant for the extraction of cancer histology data, a task with more than 500 class labels; these results suggest that data partitioning may alleviate the complexity of the task. In contrast, the methods did not achieve superior scores for the site and subsite classification tasks. Because data partitioning was based on the primary cancer site, accuracy depended on how the partitions were determined, which needs further investigation and improvement.

CONCLUSION Results in this research demonstrate that (1) the data partitioning and bagging strategy achieved higher performance scores, and (2) training was accelerated by leveraging the high-performance Summit supercomputer at ORNL.
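The partitioned bagging procedure described in the methods can be illustrated with a minimal sketch. This is not the authors' implementation: a toy token-count classifier stands in for the MT-CNN and MT-HCAN models, majority voting is assumed as the aggregation rule (the abstract says only that the models are aggregated), and all function names and the example cases are hypothetical. Partitioning by primary site follows the abstract.

```python
import random
from collections import Counter, defaultdict


def bootstrap_sample(cases, rng):
    """Draw len(cases) cases with replacement (one bootstrap replicate)."""
    return [rng.choice(cases) for _ in cases]


def train_toy_model(sample):
    """Toy stand-in for a deep classifier: count labels seen with each token."""
    token_label_counts = defaultdict(Counter)
    for _site, tokens, label in sample:
        for token in tokens:
            token_label_counts[token][label] += 1
    return token_label_counts


def predict(model, tokens):
    """Return the label with the most token co-occurrences, or None."""
    votes = Counter()
    for token in tokens:
        votes.update(model.get(token, {}))
    return votes.most_common(1)[0][0] if votes else None


def train_partitioned_bagging(cases, n_bootstrap, seed=0):
    """Train n_bootstrap models per partition; partitions are keyed by site."""
    rng = random.Random(seed)
    by_site = defaultdict(list)
    for case in cases:
        by_site[case[0]].append(case)
    return {
        site: [train_toy_model(bootstrap_sample(site_cases, rng))
               for _ in range(n_bootstrap)]
        for site, site_cases in by_site.items()
    }


def predict_bagged(ensembles, site, tokens):
    """Aggregate one partition's models by majority vote (assumed rule)."""
    votes = Counter(predict(model, tokens) for model in ensembles[site])
    votes.pop(None, None)  # ignore models that saw none of the tokens
    return votes.most_common(1)[0][0] if votes else None


# Made-up cases: (primary site, report tokens, histology label).
cases = (
    [("breast", ["ductal", "invasive"], "8500")] * 5
    + [("breast", ["lobular", "insitu"], "8520")] * 5
    + [("lung", ["adenocarcinoma"], "8140")] * 5
    + [("lung", ["squamous", "cell"], "8070")] * 5
)
ensembles = train_partitioned_bagging(cases, n_bootstrap=25)
```

Each model in a partition only has to separate the labels that occur for that site, which is the sense in which partitioning may reduce task complexity. The total number of models is the number of partitions times the number of bootstrap samples, so 20 partitions and 2,000 samples would give the 40,000 models mentioned above. Because the per-model training calls are independent, they can be dispatched concurrently.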
