Biomedical literature classification with a CNNs-based hybrid learning network

Deep learning techniques such as Convolutional Neural Networks (CNNs) have been widely applied in information retrieval and natural language processing. However, few research efforts have addressed semantic indexing with deep learning. The use of semantic indexing in the biomedical literature has been limited for several reasons. For instance, MEDLINE citations carry a large number of semantic labels from automatically annotated MeSH terms, and for much of the literature only the title and abstract are readily available. In this paper, we propose a Boltzmann Convolutional Neural Network framework (B-CNN) for biomedical semantic indexing. In our hybrid learning framework, the CNN adaptively handles document features that have sequence relationships and captures context information accordingly, while the Deep Boltzmann Machine (DBM) merges global information (the entity in each document) and local information through training with undirected connections. Additionally, we design a hierarchical coarse-to-fine indexing structure for learning and classifying documents, and a novel feature extension approach based on word sequence embedding and Wikipedia categorization. Comparative experiments on semantic indexing of biomedical abstracts verified the encouraging performance of our B-CNN model.
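As an illustration of how a convolutional filter can capture context from a word sequence, the following is a minimal sketch in plain Python. It is not the B-CNN model: the DBM component, the hierarchical indexing structure, and the Wikipedia-based feature extension are all omitted. The function name `conv1d_max_pool`, the toy embeddings, and the filter weights are assumptions made up for this example.

```python
from typing import List

Vector = List[float]


def conv1d_max_pool(embeddings: List[Vector], filt: List[Vector], bias: float = 0.0) -> float:
    """Slide one filter over a sequence of word vectors, apply ReLU,
    then keep the strongest response (max-over-time pooling)."""
    width = len(filt)
    if len(embeddings) < width:
        raise ValueError("sequence is shorter than the filter width")
    activations = []
    for start in range(len(embeddings) - width + 1):
        window = embeddings[start:start + width]
        total = bias
        # Dot product between the window of word vectors and the filter.
        for word_vec, filt_vec in zip(window, filt):
            total += sum(w * f for w, f in zip(word_vec, filt_vec))
        activations.append(max(0.0, total))  # ReLU
    return max(activations)


# Toy 2-dimensional embeddings for a four-word sequence (made-up values).
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
# One filter spanning two adjacent words (made-up weights).
bigram_filter = [[1.0, 0.0], [0.0, 1.0]]

feature = conv1d_max_pool(sentence, bigram_filter)
```

Because each filter spans adjacent words, its response depends on word order and local context rather than on isolated terms. A practical model would learn many such filters and pass the pooled features on to later layers.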
