Paper Information - Performance Evaluation of Deep Learning Algorithms in Biomedical Document Classification


Document classification is a prevalent task in Natural Language Processing (NLP) with an extensive range of applications in the biomedical domain, such as biomedical literature indexing, automatic diagnosis code assignment, classification of tweets on public health topics, and patient safety report classification. However, manually classifying the biomedical articles published every year into specific predefined categories is a cumbersome task. Hence, building an automatic document classifier for biomedical databases has emerged as a significant task in the scientific community. In recent years, Deep Learning (DL) models such as Deep Neural Networks (DNN), Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), and ensemble deep learning models have been widely used for text document classification because they achieve better classification performance than traditional Machine Learning (ML) algorithms. The major advantage of DL models in document classification is that pre-trained word embeddings supply rich semantic and grammatical information for document representation. Hence, this paper investigates the use of various state-of-the-art DL-based classification models for the automatic classification of benchmark biomedical datasets. Finally, the performance of all the aforementioned classifiers is compared and evaluated using well-defined performance metrics such as accuracy, precision, recall, and F1-measure.
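For readers unfamiliar with the four metrics named above, the following is a minimal sketch of how they can be computed for a multi-class classifier. It is not taken from the paper: the function name `classification_metrics`, the toy labels, and the choice of macro averaging (an unweighted mean over classes) are all assumptions made for illustration, since the abstract does not state how the per-class scores are combined.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Hypothetical helper for illustration; macro averaging is an assumption,
    not something the abstract specifies.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class was wrong
            fn[truth] += 1   # true class was missed

    def safe_div(num, den):
        # Define the score as 0.0 when a class has no predictions or no samples.
        return num / den if den else 0.0

    precisions, recalls, f1s = [], [], []
    for label in labels:
        precision = safe_div(tp[label], tp[label] + fp[label])
        recall = safe_div(tp[label], tp[label] + fn[label])
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(safe_div(2 * precision * recall, precision + recall))

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    # Toy labels, invented for this example only.
    truth = ["a", "a", "b", "b"]
    predicted = ["a", "b", "b", "b"]
    print(classification_metrics(truth, predicted))
```

On imbalanced biomedical datasets, macro averaging gives rare classes the same weight as frequent ones, which is why it may differ noticeably from accuracy; a weighted or micro average would be an equally valid choice under different assumptions.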
