Online Public Opinion System: Design and Applications

Online Public Opinion Systems (OPOS) aim to collect, analyze, summarize, and monitor massive volumes of public opinion on the Internet in real time. An OPOS can often also identify key or sudden events and immediately notify the relevant people so that they can respond rapidly. As part of this endeavor, this paper introduces the architecture and techniques of an OPOS that has been used by several large enterprises. This self-designed OPOS consists of three layers, from bottom to top: a data layer, a computation layer, and an application layer. We first introduce the basic functions and key techniques of each layer, and then present several typical yet important algorithms in the computation layer. Experimental results on real-world data validate the effectiveness of the algorithms embedded in our system. Finally, a demonstration of the system deployed at a ship-building company is provided to illustrate the value of our OPOS for real enterprises.
