A benchmark dataset and case study for Chinese medical question intent classification

Background To provide satisfying answers, medical QA system has to understand the intentions of the users’ questions precisely. For medical intent classification, it requires high-quality datasets to train a deep-learning approach in a supervised way. Currently, there is no public dataset for Chinese medical intent classification, and the datasets of other fields are not applicable to the medical QA system. To solve this problem, we construct a Chinese medical intent dataset (CMID) using the questions from medical QA websites. On this basis, we compare four intent classification models on CMID using a case study. Methods The questions in CMID are obtained from several medical QA websites. The intent annotation standard is developed by the medical experts, which includes four types and 36 subtypes of users’ intents. Besides the intent label, CMID also provides two types of additional information, including word segmentation and named entity. We use the crowdsourcing way to annotate the intent information for each Chinese medical question. Word segmentation and named entities are obtained using the Jieba and a well-trained Lattice-LSTM model. We loaded a Chinese medical dictionary consisting of 530,000 for word segmentation to obtain a more accurate result. We also select four popular deep learning-based models and compare their performances of intent classification on CMID. Results The final CMID contains 12,000 Chinese medical questions and is organized in JSON format. Each question is labeled the intention, word segmentation, and named entity information. The information about question length, number of entities, and are also detailed analyzed. Among Fast Text, TextCNN, TextRNN, and TextGCN, Fast Text and TextCNN models have achieved the best results in four types and 36 subtypes intent classification, respectively. Conclusions In this work, we provide a dataset for Chinese medical intent classification, which can be used in medical QA and related fields. We performed an intent classification task on the CMID. In addition, we also did some analysis on the content of the dataset.

[1]  Jean Ponce,et al.  Learning mid-level features for recognition , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[2]  Jeffrey Dean,et al.  Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[3]  Charles L. A. Clarke,et al.  Information Retrieval - Implementing and Evaluating Search Engines , 2010 .

[4]  Stefan Feuerriegel,et al.  RankQA: Neural Question Answering with Answer Re-Ranking , 2019, ACL.

[5]  Masaaki Nagata,et al.  Answering while Summarizing: Multi-task Learning for Multi-hop QA with Evidence Extraction , 2019, ACL.

[6]  Yue Zhang,et al.  Chinese NER Using Lattice LSTM , 2018, ACL.

[7]  Yuan Luo,et al.  Graph Convolutional Networks for Text Classification , 2018, AAAI.

[8]  Max Welling,et al.  Semi-Supervised Classification with Graph Convolutional Networks , 2016, ICLR.

[9]  Tomas Mikolov,et al.  Bag of Tricks for Efficient Text Classification , 2016, EACL.

[10]  Ben He,et al.  Question-answer topic model for question retrieval in community question answering , 2012, CIKM.

[11]  Gang Wang,et al.  Understanding user's query intent with wikipedia , 2009, WWW '09.

[12]  Xuanjing Huang,et al.  Recurrent Neural Network for Text Classification with Multi-Task Learning , 2016, IJCAI.

[13]  Ming Zhou,et al.  S-Net: From Answer Extraction to Answer Generation for Machine Reading Comprehension , 2017, AAAI 2017.

[14]  Eduard H. Hovy,et al.  End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF , 2016, ACL.

[15]  Jing Li,et al.  Directional Skip-Gram: Explicitly Distinguishing Left and Right Context for Word Embeddings , 2018, NAACL.

[16]  Fabien L. Gandon,et al.  QAKiS: an Open Domain QA System based on Relational Patterns , 2012, SEMWEB.

[17]  Yue Gao,et al.  Beyond Text QA: Multimedia Answer Generation by Harvesting Web Information , 2013, IEEE Transactions on Multimedia.

[18]  Nan Yang,et al.  I Know There Is No Answer: Modeling Answer Validation for Machine Reading Comprehension , 2018, NLPCC.

[19]  Juan Zamora,et al.  Building Decision Trees to Identify the Intent of a User Query , 2009, KES.

[20]  B. S. P. Mishra,et al.  Query Intent Classification using Semi- Supervised Learning , 2014 .

[21]  Yoon Kim,et al.  Convolutional Neural Networks for Sentence Classification , 2014, EMNLP.

[22]  Jeffrey Dean,et al.  Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.

[23]  Kenneth Ward Church,et al.  Word Association Norms, Mutual Information, and Lexicography , 1989, ACL.