Application of Multilabel Learning Using the Relevant Feature for Each Label in Chronic Gastritis Syndrome Diagnosis

Background. In Traditional Chinese Medicine (TCM), most algorithms used for syndrome diagnosis focus on only one syndrome, that is, single-label learning. In clinical practice, however, a patient may have more than one syndrome at the same time, each with its own symptoms (signs). Methods. We employed a multilabel learning algorithm that uses the relevant features for each label (REAL) to construct a syndrome diagnostic model for chronic gastritis (CG) in TCM. REAL combines feature selection methods to select the significant symptoms (signs) of CG. The method was tested on 919 patients using the standard scale. Results. The highest prediction accuracy was achieved when 20 features were selected. The features selected with information gain were more consistent with TCM theory. Across the diagnostic models compared, the lowest average accuracy was 54%, obtained with multilabel neural networks (BP-MLL), and the highest was 82%, obtained with REAL. For coverage, Hamming loss, and ranking loss, REAL gave the lowest values: 0.160, 0.142, and 0.177, respectively. Conclusion. REAL extracts the relevant symptoms (signs) for each syndrome and improves recognition accuracy for each syndrome. Moreover, this study provides a reference for constructing syndrome diagnostic models and for guiding clinical practice.
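The idea of selecting relevant features separately for each label can be illustrated with a minimal sketch. It is not the authors' REAL implementation. It assumes binary symptom indicators and binary syndrome labels, ranks symptoms by information gain for each syndrome, and keeps the top k. The function names (entropy, information_gain, top_features_per_label) and the toy data are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def information_gain(feature, label):
    """IG(label; feature) = H(label) - H(label | feature)."""
    n = len(label)
    conditional = 0.0
    for v in set(feature):
        # Labels of the samples whose feature takes the value v.
        subset = [y for x, y in zip(feature, label) if x == v]
        conditional += len(subset) / n * entropy(subset)
    return entropy(label) - conditional


def top_features_per_label(X, Y, k):
    """For each label column in Y, return the indices of the k features
    with the highest information gain (one index list per label)."""
    n_features = len(X[0])
    n_labels = len(Y[0])
    selected = []
    for j in range(n_labels):
        label = [row[j] for row in Y]
        gains = [
            information_gain([row[f] for row in X], label)
            for f in range(n_features)
        ]
        ranked = sorted(range(n_features), key=lambda f: gains[f], reverse=True)
        selected.append(ranked[:k])
    return selected


# Toy data (hypothetical): rows are patients, columns of X are symptoms,
# columns of Y are syndromes. A patient may carry several syndromes at once.
X = [
    [1, 0, 1],
    [1, 1, 1],
    [0, 0, 1],
    [0, 1, 0],
]
Y = [
    [1, 0],
    [1, 1],
    [0, 0],
    [0, 1],
]

selected = top_features_per_label(X, Y, k=1)
print(selected)
```

In this toy example each syndrome ends up with a different most informative symptom, which is the property that per-label selection is meant to capture. A single shared feature subset would have to compromise between the two syndromes.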
