Bridging the gap between syndrome in Traditional Chinese Medicine and proteome in Western medicine with an unsupervised pattern discovery algorithm

Studying the molecular basis of syndrome in traditional Chinese medicine (TCM) is both a research hotspot and a challenge for the medical community. In this paper, we combine clinical epidemiology, proteomic techniques, and data mining to investigate the molecular basis of syndrome. We conduct a clinical epidemiology survey of coronary heart disease to collect case and control patients. We also analyze the two-dimensional electrophoresis results of blood samples from the included patients to identify significantly expressed proteins. We find that blood stasis syndrome is significantly associated with 10 inflammatory factor proteins. Based on the collected data, we propose an unsupervised pattern discovery algorithm to detect significantly associated patterns in the data. The algorithm retrieves 14 patterns containing both syndrome and proteins, which can be regarded as evidence of an association between TCM syndrome and the proteome. Furthermore, we validate the unsupervised pattern discovery results by combining a support vector machine with 10-fold cross-validation; the classification accuracy is higher than 90%, which indicates that the pattern discovery results are reliable. This work offers better insight into the integration of TCM and Western medicine and provides a better way to study the molecular basis of syndrome.
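The validation step, a support vector machine evaluated with 10-fold cross-validation, can be illustrated with a minimal sketch. It is not the implementation used in this study. The data are synthetic (10 Gaussian features standing in for protein expression levels, and a binary label standing in for presence or absence of the syndrome). The linear SVM trained by subgradient descent on the hinge loss, the hyperparameters, and all function names are assumptions made for illustration only.

```python
import random


def make_synthetic_data(n_per_class=100, n_features=10, seed=0):
    """Synthetic stand-in for the study data (assumption, not real measurements)."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (+1, -1):
        for _ in range(n_per_class):
            # Each class is a Gaussian cloud centred at +1 or -1 in every feature.
            X.append([rng.gauss(label * 1.0, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y


def train_linear_svm(X, y, lam=0.01, eta=0.01, epochs=20, seed=0):
    """Linear SVM via stochastic subgradient descent on the L2-regularised hinge loss."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            violated = y[i] * score < 1.0  # sample lies inside the margin
            for j in range(len(w)):
                grad = lam * w[j] - (y[i] * X[i][j] if violated else 0.0)
                w[j] -= eta * grad
            if violated:
                b += eta * y[i]
    return w, b


def predict(w, b, x):
    """Return +1 or -1 according to the side of the separating hyperplane."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(X, y, k=10):
    """Mean held-out accuracy over k folds; each fold is the test set exactly once."""
    accuracies = []
    for fold in k_fold_indices(len(X), k):
        held_out = set(fold)
        train = [i for i in range(len(X)) if i not in held_out]
        w, b = train_linear_svm([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(w, b, X[i]) == y[i] for i in fold)
        accuracies.append(correct / len(fold))
    return sum(accuracies) / k


if __name__ == "__main__":
    X, y = make_synthetic_data()
    print(f"mean 10-fold accuracy: {cross_validate(X, y):.3f}")
```

In this sketch every sample is held out exactly once, so the reported accuracy is always measured on data the classifier did not see during training. A kernel SVM or a different feature encoding of the discovered patterns would slot into the same cross-validation loop.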