Web Information Systems Engineering – WISE 2016

Many of today’s online news websites and aggregator apps let users publish their opinions regardless of time and place. Existing work on topic-based sentiment analysis of product reviews cannot be applied directly to online news, for two reasons: (1) the dynamic nature of news streams requires the topic and sentiment analysis model to be updated dynamically as well; (2) user interactions among news comments can easily lead to inaccurate topic and sentiment extraction. In this paper, we propose a novel probabilistic generative model (DTSA) that extracts topics and their associated sentiments from news streams and simultaneously analyzes their evolution over time. DTSA incorporates a multiple-timescale model into a generative topic model. We also consider the links among news comments to avoid the errors caused by user interactions. Finally, we derive distributed online inference procedures that update the model with newly arrived data, and we show the effectiveness of the proposed model on real-world data sets.
