End-to-End syndrome differentiation of Yin deficiency and Yang deficiency in traditional Chinese medicine

BACKGROUND AND OBJECTIVE Yin and yang, two concepts adapted from classical Chinese philosophy, play a diagnostic role in Traditional Chinese Medicine (TCM). Yin and yang in harmonious balance indicate health, whereas an imbalance toward either side indicates ill health, which may result in disease. Yin-yang disharmony is considered to be the cause of pathological changes. Syndrome differentiation of yin and yang is therefore crucial to clinical diagnosis: it lays the foundation for subsequent medical judgments, including the choice of therapeutic method and formula, among others. However, because the mechanisms and manifestations of disease are complex, it is difficult to determine exactly which of the two, yin or yang, is in disharmony. Little research has addressed syndrome differentiation of yin and yang from a computational perspective. In this study, we present a computational method, viz. end-to-end syndrome differentiation of yin deficiency and yang deficiency.
METHODS Unlike most previous studies on syndrome differentiation, which use structured datasets, this study takes unstructured text from medical records as its input and models syndrome differentiation as a text classification task. It experiments with two state-of-the-art end-to-end algorithms for text classification, i.e. a classic convolutional neural network (CNN) and fastText. Both systems take as input the n-grams of several types of tokens, including characters, terms, and words.
RESULTS When evaluated on a data set of 7326 modern TCM medical records, CNN and fastText generally achieve comparable performance. The best accuracy, 92.55%, comes from the system whose inputs are as raw as character n-grams. This implies that one can build at least a moderately effective system for differentiating yin deficiency and yang deficiency even with no glossary or tokenizer at hand.
CONCLUSIONS This study has demonstrated the feasibility of using end-to-end text classification algorithms to differentiate yin deficiency and yang deficiency on unstructured medical records.
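
The character n-gram input described under METHODS can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the n-gram range, the bucket count, and the sample phrase are all assumptions made for illustration. It extracts character n-grams from raw text without any tokenizer or glossary, then hashes them into a fixed number of buckets, which is one common way to turn n-grams into features for a linear text classifier.

```python
import zlib
from collections import Counter


def char_ngrams(text, n_min=1, n_max=3):
    """Return character n-grams of length n_min..n_max (whitespace dropped).

    No tokenizer or glossary is needed: the text is treated as a plain
    sequence of characters. The default range 1..3 is an assumption.
    """
    chars = [c for c in text if not c.isspace()]
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(chars) - n + 1):
            grams.append("".join(chars[i:i + n]))
    return grams


def hashed_bag(grams, num_buckets=1 << 20):
    """Map n-grams to bucket counts with a deterministic hash (CRC32).

    The bucket count is a hypothetical choice; distinct n-grams may
    collide in the same bucket, which is an accepted trade-off of hashing.
    """
    bag = Counter()
    for g in grams:
        bag[zlib.crc32(g.encode("utf-8")) % num_buckets] += 1
    return bag


if __name__ == "__main__":
    # Made-up sample phrase ("dry mouth and throat"), not taken from the data set.
    sample = "口干咽燥"
    grams = char_ngrams(sample, 1, 2)
    print(grams)
    print(len(hashed_bag(grams)))
```

The resulting bucket counts could be fed to any classifier; the two systems evaluated in the study learn their own representations from such n-gram inputs, so this sketch covers only the input side.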