Blog Style Classification: Refining Affective Blogs

In the constantly growing blogosphere, which imposes no restrictions on form or topic, a number of writing styles and genres have emerged. Recognizing and classifying these styles has become important for information processing, with the aim of improving blog search or sentiment mining. One of the main problems in this field is distinguishing informative articles from affective ones. However, this two-way distinction is no longer sufficient. In this paper we extend it and propose a fine-grained set of subcategories for affective articles. We propose and evaluate a classification method that employs novel lexical, morphological, lightweight syntactic, and structural features of written text. The results show that our method outperforms existing approaches.
