User Interest Modeling Approach Based on Short Text of Micro-blog

In this paper, a method on modeling user's interests based on short text of micro-blog is presented. In order to overcome the lack of information in short text, on the base of analyzing the structure and content of micro-blog short text, this paper proposes an approach on micro-blog short text reconstruction, and namely, according to the other related and the three kinds of special symbols of the text, extends the content, thereby extending the characteristic information of original micro-blog. It takes advantage of HowNet2000 concept dictionary to map the feature set of reconstruction text to a set of concepts. It clusters the set of concepts to divide user's interests, and meanwhile, a representation mechanism of user interest model is presented. Experimental results show that the short text reconstruction and concept mapping can improve the effect of clustering. Compared with the modeling based on collaborative filtering, F-Measure value is increased by 29.1%. This means the proposed micro-blog user's interest modeling has a better performance.