Community discovery and sentiment mining for Chinese BLOG

The emergence of the BLOG represents not only a new network technology but also the beginning of a new lifestyle. How to utilize and mine BLOG content, which carries hidden sentiment and is updated in real time, is a major challenge in the data-mining domain. Most existing methods for mining the topics of network text work by clustering on each text's topic and label, which are assigned by hand and often do not match the content of the text, so the accuracy of these methods is low. In this paper, we present a BLOG community mining method based on topic features. We define topic features as a representation of the major content of a BLOG. We also present an algorithm for the continuous evaluation of hidden sentiment in BLOGs, called CSETR (Continuous Sentiment Evaluation with Time Regulator). CSETR evaluates continuous sentiment scores for topic features and also considers changes in sentiment intensity over time. Finally, using CSETR, we build a sentiment trend model for the Chinese BLOG community.
