The Risk Level Estimation Based on Deep Learning Method for Tianya Forum

Using societal risk indicators from social psychology, a deep learning method is applied to estimate the risk level of the Tianya Forum. Because it captures both semantic and word-order information in documents, Post Vector, a deep learning method, is used to generate distributed representations of BBS posts. In experiments on societal risk classification of BBS posts, kNN based on Post Vector outperforms kNN based on Bag-of-Words, edit distance, or a Lucene-based search method. Therefore, using kNN based on Post Vector and the annotated data of the Tianya Zatan board, the risk level of the Baixing Shengyin board is estimated for different months, and the reasonableness of the estimated results is analyzed.
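The classification step described above can be illustrated with a minimal sketch, assuming each post has already been mapped to a fixed-length vector. The vectors, the label names, the cosine similarity measure, the value of k, and the function names (`cosine`, `knn_label`, `monthly_label_counts`) are all hypothetical choices made for illustration; the abstract does not specify them, nor how per-post labels are aggregated into a monthly risk level.

```python
import math
from collections import Counter, defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def knn_label(query, labeled, k=3):
    """Return the majority label among the k labeled vectors most similar to query."""
    ranked = sorted(labeled, key=lambda item: cosine(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def monthly_label_counts(posts, labeled, k=3):
    """Classify each (month, vector) post and count the predicted labels per month."""
    counts = defaultdict(Counter)
    for month, vector in posts:
        counts[month][knn_label(vector, labeled, k)] += 1
    return dict(counts)


# Hypothetical annotated vectors (standing in for annotated Tianya Zatan posts).
LABELED = [
    ([1.0, 0.0], "risk-free"),
    ([0.9, 0.1], "risk-free"),
    ([0.0, 1.0], "public morals"),
    ([0.1, 0.9], "public morals"),
    ([0.2, 0.8], "public morals"),
]

# Hypothetical unlabeled vectors (standing in for Baixing Shengyin posts), tagged by month.
POSTS = [
    ("2013-01", [0.95, 0.05]),
    ("2013-01", [0.05, 0.95]),
    ("2013-02", [0.10, 0.90]),
]

if __name__ == "__main__":
    for month, label_counts in sorted(monthly_label_counts(POSTS, LABELED).items()):
        print(month, dict(label_counts))
```

In this sketch the monthly output is only a count of predicted labels; any mapping from those counts to a single risk level would be a further assumption.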
