Bursty Events Detection Approach on Chinese Microblog Based on Splay Tree Optimization

As a new medium, microblogging plays an important role in people's daily lives. Users can obtain, publish, and share information in a timely manner and interact with other users. Because microblog posts are short, spread quickly, and have great impact, an increasing number of scholars have chosen microblogs as their research object. Since microblogs generate a huge amount of data, bursty event detection is very important. This paper proposes an algorithm based on splay tree optimization that reduces the amount of burst word clustering computation by recording results that have already been computed. The approach trades space for time: it reduces the complexity of the original burst word clustering algorithm from O(SN^3) to O(SN^2 log N). The worst-case space complexity becomes O(N^2); in light of the actual situation, however, it drops to O(LN) when only the effective results are recorded. The value L is related to the scale of the clusters in the clustering result.
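The abstract does not spell out how results are recorded, so the following is only a minimal sketch of the general idea of trading space for time: pairwise results between burst words are stored in a splay tree keyed by the word pair, so a repeated request is answered by an amortized O(log N) lookup instead of being recomputed. Every name here (`SplayCache`, `make_cached_similarity`, `jaccard`) and the use of Jaccard similarity over post IDs are assumptions made for illustration, not the paper's actual method.

```python
class Node:
    __slots__ = ("key", "value", "left", "right")

    def __init__(self, key, value):
        self.key, self.value = key, value
        self.left = self.right = None


def splay(root, key):
    """Top-down splay: move the node closest to `key` to the root."""
    if root is None:
        return None
    header = Node(None, None)          # temporary holder for the split trees
    left_max = right_min = header
    t = root
    while True:
        if key < t.key:
            if t.left is None:
                break
            if key < t.left.key:       # zig-zig: rotate right
                y = t.left
                t.left, y.right = y.right, t
                t = y
                if t.left is None:
                    break
            right_min.left = t         # link t into the right tree
            right_min = t
            t = t.left
        elif key > t.key:
            if t.right is None:
                break
            if key > t.right.key:      # zig-zig: rotate left
                y = t.right
                t.right, y.left = y.left, t
                t = y
                if t.right is None:
                    break
            left_max.right = t         # link t into the left tree
            left_max = t
            t = t.right
        else:
            break
    left_max.right, right_min.left = t.left, t.right
    t.left, t.right = header.right, header.left
    return t


_MISSING = object()


class SplayCache:
    """Ordered map backed by a splay tree (hypothetical helper)."""

    def __init__(self):
        self.root = None

    def get(self, key, default=_MISSING):
        self.root = splay(self.root, key)
        if self.root is not None and self.root.key == key:
            return self.root.value
        return default

    def put(self, key, value):
        if self.root is None:
            self.root = Node(key, value)
            return
        self.root = splay(self.root, key)
        if self.root.key == key:
            self.root.value = value
            return
        node = Node(key, value)
        if key < self.root.key:
            node.left, node.right = self.root.left, self.root
            self.root.left = None
        else:
            node.right, node.left = self.root.right, self.root
            self.root.right = None
        self.root = node


def make_cached_similarity(compute):
    """Wrap `compute(a, b)` so each unordered pair is computed only once."""
    cache = SplayCache()
    stats = {"computed": 0}

    def similarity(a, b):
        key = (min(a, b), max(a, b))   # (a, b) and (b, a) share one entry
        value = cache.get(key)
        if value is _MISSING:
            value = compute(*key)
            stats["computed"] += 1
            cache.put(key, value)
        return value

    return similarity, stats


def jaccard(x, y):
    """Illustrative similarity between two sets of post IDs."""
    return len(x & y) / len(x | y) if (x or y) else 0.0
```

In this sketch, a clustering loop would call `similarity(a, b)` wherever it previously recomputed the pairwise value. Storing every pair corresponds to the O(N^2) worst-case space mentioned above; how the paper selects the "effective" results that bring this down to O(LN) is not stated in the abstract and is not modeled here.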
