Paper Information - A Method of Annotating Disease Names in TCM Patents Based on Co-training

In the era of big data, annotated text data is a scarce resource. Annotated semantic information can serve as keywords for text analysis, mining, and intelligent retrieval, and as valuable training and test sets for machine learning. In the analysis, mining, and intelligent retrieval of Traditional Chinese Medicine (TCM) patents, disease names are an important annotation target, alongside Chinese herbal medicine names and medicine efficacy. Drawing on the characteristics of TCM patent texts and the co-training method from machine learning, this paper proposes a method for annotating disease names in TCM patent texts. Experiments show that the method is feasible and effective. It can also be extended to annotate other semantic information in TCM patents.

Na Deng | Caiquan Xiong | Xu Chen
