A Topic Extraction Method for Network Public Opinion

Network public opinion plays a vital role in government decision-making. Many topic extraction methods exist for network public opinion, but most focus on its short-text characteristics and underuse the way it evolves over time. This paper combines textual and evolutionary characteristics in topic extraction and proposes a phase-based topic extraction (P-TE) method. First, qualitative and quantitative methods are combined, and time series analysis of the evolutionary characteristics is used to divide the course of an event into several phases. The number of phases depends on the specific event and varies from one event to another. Then, feature extraction is performed based on the textual characteristics of network public opinion. Finally, topics are extracted separately for each phase. Experimental results show that P-TE reveals more details and finer-grained insight into an event than other methods. The rationality of P-TE is also verified.
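
The three steps described above (phase division, feature extraction, per-phase topic extraction) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the time series model or the topic extractor, so this sketch assumes a simple day-over-day volume ratio as a stand-in for the time series analysis and TF-IDF term ranking as a stand-in for topic extraction. All function names, the `ratio` threshold, and the sample posts are hypothetical.

```python
from collections import Counter
from math import log


def split_into_phases(daily_counts, ratio=2.0):
    """Return the start day of each phase.

    Assumption: a new phase starts when post volume changes by at least
    `ratio` (up or down) relative to the previous day. This is a crude
    stand-in for the time series analysis mentioned in the abstract.
    """
    starts = [0]
    for day in range(1, len(daily_counts)):
        low, high = sorted((daily_counts[day - 1], daily_counts[day]))
        changed = high > 0 if low == 0 else high / low >= ratio
        if changed:
            starts.append(day)
    return starts


def phase_of(day, starts):
    """Index of the phase that contains `day`."""
    return sum(1 for start in starts if start <= day) - 1


def top_terms_per_phase(posts, starts, k=3):
    """Rank terms in each phase by TF-IDF, treating each phase as one document.

    `posts` is a list of (day, text) pairs. Whitespace tokenization is a
    placeholder for real feature extraction on short texts.
    """
    phase_tf = [Counter() for _ in starts]
    for day, text in posts:
        phase_tf[phase_of(day, starts)].update(text.lower().split())

    n_phases = len(starts)
    doc_freq = Counter(term for tf in phase_tf for term in tf)

    ranked = []
    for tf in phase_tf:
        scores = {
            term: count * (log((1 + n_phases) / (1 + doc_freq[term])) + 1)
            for term, count in tf.items()
        }
        # Sort by descending score, then alphabetically for determinism.
        ordered = sorted(scores, key=lambda term: (-scores[term], term))
        ranked.append(ordered[:k])
    return ranked


if __name__ == "__main__":
    # Hypothetical posts about one event: (day index, text).
    posts = [
        (0, "flood warning issued"),
        (1, "flood river rising"),
        (2, "rescue teams arrive"), (2, "rescue boats"),
        (2, "rescue underway"), (2, "rescue volunteers"),
        (3, "rescue teams arrive"), (3, "rescue boats"),
        (3, "rescue underway"), (3, "rescue volunteers"),
        (4, "donation drive starts"),
        (5, "donation thanks"),
    ]
    per_day = Counter(day for day, _ in posts)
    daily_counts = [per_day[day] for day in range(6)]
    starts = split_into_phases(daily_counts)
    for start, terms in zip(starts, top_terms_per_phase(posts, starts, k=1)):
        print(f"phase starting on day {start}: {terms}")
```

Treating each phase as its own document means a term that dominates one phase but is rare elsewhere ranks highly for that phase, which is the intuition behind extracting topics phase by phase instead of over the whole event at once.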
