Semi-supervised topic learning and representation method based on association rules and metadata

To address the poor semantic interpretability and limited accuracy of existing topic models, we propose a semi-supervised topic learning and representation method based on association rules and metadata. First, metadata is used as prior knowledge to guide topic learning, yielding the probability distribution of terms in the documents. Second, the frequent 3-itemsets of each topic are mined using weighted association rules. Third, the metadata of the experimental documents is used, through an improved vector space model algorithm, to improve semantic similarity. The resulting topic semantics are closer to the actual content of the documents and easier to interpret. Comparative experiments were run on the same data set using the LDA topic model representation and the proposed method. The results show that the proposed method outperforms the LDA topic model representation in both topic extraction accuracy and topic granularity, validating its effectiveness.
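
The step of mining frequent three-term itemsets per topic can be illustrated with a minimal sketch. The abstract does not specify the weighting scheme, so the definition used here is an assumption: the weighted support of an itemset is taken to be the mean weight of its terms multiplied by the fraction of documents that contain all of them. The function name weighted_frequent_triples, the parameter names, and the threshold min_wsup are hypothetical, and term_weight stands in for the topic-term probabilities obtained in the first step.

```python
from collections import Counter
from itertools import combinations


def weighted_frequent_triples(docs, term_weight, min_wsup):
    """Return {triple: weighted support} for 3-term itemsets of one topic.

    docs        : list of token lists (documents assigned to the topic)
    term_weight : dict term -> weight (assumed: topic-term probability)
    min_wsup    : minimum weighted support threshold (hypothetical parameter)

    Assumed definition, not taken from the paper:
        wsup(X) = mean(weight of terms in X) * (docs containing X) / (all docs)
    """
    counts = Counter()
    for doc in docs:
        # Keep only distinct terms that have a weight; sort so that each
        # triple has one canonical ordering.
        terms = sorted(t for t in set(doc) if t in term_weight)
        for triple in combinations(terms, 3):
            counts[triple] += 1

    n_docs = len(docs)
    frequent = {}
    for triple, count in counts.items():
        mean_weight = sum(term_weight[t] for t in triple) / 3
        wsup = mean_weight * count / n_docs
        if wsup >= min_wsup:
            frequent[triple] = wsup
    return frequent
```

Enumerating all triples is only practical for short documents or a pruned vocabulary; an Apriori-style search that extends frequent pairs would scale better, at the cost of a longer example.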