Multi-aspect Blog Sentiment Analysis Based on LDA Topic Model and Hownet Lexicon

Blogs are an important Web 2.0 application that attracts many users to express subjective opinions about financial events, political events, and other subjects. A Blog page usually covers more than one theme, yet existing research on multi-aspect sentiment analysis focuses on product reviews. In this paper, we propose a multi-aspect Chinese Blog sentiment analysis method based on the LDA topic model and the Hownet lexicon. First, we train an LDA topic model on a Chinese Blog corpus and identify the themes of the corpus. The trained LDA model is then used to segment each Blog page into themes at the paragraph level. Next, a sentiment word tagging method based on Hownet is used to calculate the sentiment orientation of each Blog theme. The sentiment orientation of a Blog page can thus be represented by the sentiment orientations of its individual themes. Experimental results on a SINA Blog dataset show that our method not only produces good topic segments but also improves sentiment classification performance.
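
The pipeline summarized above (assign each paragraph to a theme, then score each theme with a sentiment lexicon) can be illustrated with a minimal sketch. This is not the paper's implementation: the topic-word table `TOPIC_WORDS` is a hypothetical stand-in for a trained LDA model, the `POSITIVE` and `NEGATIVE` sets are toy stand-ins for the Hownet sentiment word lists, the score formula is an assumed one, and whitespace tokenization of English text replaces the Chinese word segmentation that real Blog text would need.

```python
import math
from collections import defaultdict

# Hypothetical topic-word probabilities, standing in for a trained LDA model.
TOPIC_WORDS = {
    "finance": {"stock": 0.4, "market": 0.4, "bank": 0.2},
    "politics": {"election": 0.5, "policy": 0.3, "vote": 0.2},
}
# Toy polarity word lists, standing in for the Hownet sentiment lexicon.
POSITIVE = {"good", "rise", "win"}
NEGATIVE = {"bad", "fall", "lose"}
# Assumed small probability for words a topic has never seen.
SMOOTHING = 1e-3


def assign_topic(tokens):
    """Return the topic under which the paragraph's tokens are most likely."""
    def log_likelihood(topic):
        probs = TOPIC_WORDS[topic]
        return sum(math.log(probs.get(t, SMOOTHING)) for t in tokens)
    return max(TOPIC_WORDS, key=log_likelihood)


def sentiment_score(tokens):
    """Assumed score in [-1, 1]: (positive - negative) / (positive + negative)."""
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def analyze_blog(paragraphs):
    """Group paragraphs by topic, then score the sentiment of each topic."""
    tokens_by_topic = defaultdict(list)
    for paragraph in paragraphs:
        tokens = paragraph.lower().split()
        tokens_by_topic[assign_topic(tokens)].extend(tokens)
    return {topic: sentiment_score(toks) for topic, toks in tokens_by_topic.items()}


blog = [
    "stock market rise good",
    "election policy bad lose",
    "bank stock fall",
]
result = analyze_blog(blog)
print(result)
```

In this sketch the page is represented by one sentiment score per theme rather than a single page-level label, which mirrors the multi-aspect representation described in the abstract.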