A Topic Detection Method for Chinese Microblog

A model for topic detection in Chinese microblogs is proposed. Building on the traditional vector space model, we introduce a feature selection and weight computation method that represents microblog messages in a way suited to their special characteristics. We also introduce a scoring method for tweets that first filters out most topic-unrelated tweets, minimizing the impact of noise. A topic detection algorithm that uses a new vector distance computation method is then applied to the remaining tweets. The results show that our method filters out almost all topic-unrelated tweets and identifies topics in microblogs accurately and efficiently. Topic detection in microblogs can help users and governments discover hot topics dynamically.
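The pipeline summarized above (vector representation, noise filtering, distance-based topic grouping) can be illustrated with a minimal sketch. It is not the method proposed here: standard smoothed TF-IDF stands in for the feature weighting, a simple minimum-token count stands in for the tweet scoring method, and cosine distance with single-pass clustering stands in for the new vector distance and the detection algorithm. All function names, the `min_terms` and `threshold` parameters, and the assumption that tweets are already segmented into tokens are hypothetical.

```python
import math
from collections import Counter


def filter_noise(docs, min_terms=2):
    """Placeholder scoring step (assumption): keep tweets with enough tokens."""
    return [doc for doc in docs if len(doc) >= min_terms]


def tfidf_vectors(docs):
    """Map each token list to a sparse {term: weight} vector (smoothed TF-IDF)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            t: (c / len(doc)) * (math.log((1 + n) / (1 + df[t])) + 1.0)
            for t, c in tf.items()
        })
    return vectors


def cosine_distance(a, b):
    """1 - cosine similarity of two sparse vectors; 1.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    if na == 0.0 or nb == 0.0:
        return 1.0
    return 1.0 - dot / (na * nb)


def single_pass(vectors, threshold=0.8):
    """Assign each vector to the nearest cluster closer than threshold,
    otherwise start a new cluster. Centroids are running sums of members."""
    clusters = []
    for i, v in enumerate(vectors):
        best, best_d = None, threshold
        for c in clusters:
            d = cosine_distance(v, c["centroid"])
            if d < best_d:
                best, best_d = c, d
        if best is None:
            clusters.append({"centroid": dict(v), "members": [i]})
        else:
            for t, w in v.items():
                best["centroid"][t] = best["centroid"].get(t, 0.0) + w
            best["members"].append(i)
    return [c["members"] for c in clusters]


def detect_topics(docs, min_terms=2, threshold=0.8):
    """Filter noise, vectorize, and cluster; returns (kept_docs, clusters),
    where each cluster is a list of indices into kept_docs."""
    kept = filter_noise(docs, min_terms)
    return kept, single_pass(tfidf_vectors(kept), threshold)
```

Calling `detect_topics` on a list of pre-segmented tweets returns the tweets that survived filtering together with groups of indices, one group per candidate topic. Because single-pass clustering depends on input order and on the chosen threshold, both would need tuning on real microblog data.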