Topical Expressivity in Short Texts

The volume of online data is growing rapidly, and the bulk of it is generated on short-text social media platforms such as Twitter. Such platforms are fundamental to knowledge-based social media applications like recommender systems. Twitter, for example, provides rich real-time streaming information. Because of the streaming nature of the platform, extracting knowledge from such short texts without automated support is not feasible. An automated method for comprehending patterns in such text is therefore needed by many knowledge systems. This paper provides solutions for generating topics from Twitter data. We present several topic modelling techniques for identifying topics of interest in short texts. Topic modelling is inherently problematic for short texts, which have a very sparse vocabulary and are written in informal language. Our findings are informative for knowledge extraction in social media-based recommender systems as well as for understanding tweeters over time.
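The sparsity problem mentioned above can be made concrete with a small sketch. It is not the paper's method: the tokenizer, the function names, and the three example tweets are all assumptions made for illustration. It measures how much of a document-term matrix is actually filled when every document is a single short text.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep word-like tokens, including hashtags and mentions."""
    return re.findall(r"[#@]?\w+", text.lower())


def doc_term_density(docs):
    """Return the fraction of non-zero cells in the document-term count matrix."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = set().union(*counts)
    if not vocab:
        return 0.0
    # Each Counter holds one entry per distinct term in that document.
    nonzero = sum(len(c) for c in counts)
    return nonzero / (len(docs) * len(vocab))


# Hypothetical tweets, invented for this sketch (not data from the paper).
tweets = [
    "earthquake in tokyo right now",
    "lol traffic again #monday",
    "new phone who dis",
]
density = doc_term_density(tweets)
```

In this toy input no term is shared between tweets, so each document fills only its own columns and a third of the matrix is non-zero. With thousands of tweets the vocabulary grows while each row stays a handful of terms long, which leaves a topic model very little word co-occurrence evidence per document.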
