Longitudinal Analysis of Linguistic Flexibility of Value-Motivated Groups

Increasing globalization creates a growing need for ways to analyze and understand groups from different cultures and ideologies. Researchers have used written text as a medium to examine political discourse and analyze value-motivated groups. Previous work showed that computational linguistic analysis can infer the flexibility of value-motivated groups from their writings. The main premise of this work is that text can bring insight into individuals’ and groups’ ways of thinking and, potentially, their behaviour. While existing approaches provide viable solutions for characterizing groups’ ideological behaviour, they perform their analyses over all text published by a group. However, researchers have found that religious and value-motivated groups cannot be analyzed collectively in this way, because they regularly evolve. To address this gap, we analyze the performance of existing methods on single documents. Experimental results show that features previously used to predict groups’ flexibility (e.g., use of pronouns and judgment statements) are less predictive of the flexibility of single documents. We show that a newly added feature capturing the identity of a group contributes significantly to the prediction. Furthermore, because our data are unbalanced, we propose a weighting scheme for linear regression based on the inter-group variance. Results indicate that weighted least squares significantly outperforms a traditional (unweighted) least squares approach. This work brings new insights into the characteristics of different linguistic and performative signals and their relationship to the linguistic flexibility of groups. It also provides a decision-support tool for practitioners.
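The abstract does not give the formula for the variance-based weights, so the following is only a minimal sketch of weighted least squares in general, not the scheme proposed here. As a stand-in assumption, each document is weighted by the inverse of the variance of the flexibility scores within its group; the function names `group_variance_weights` and `weighted_least_squares` are hypothetical.

```python
import numpy as np

def group_variance_weights(y, groups, eps=1e-8):
    """Assumed scheme (not from the paper): weight = 1 / variance of the
    response within the document's group. eps guards against zero variance."""
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups)
    weights = np.empty_like(y)
    for g in np.unique(groups):
        mask = groups == g
        weights[mask] = 1.0 / (np.var(y[mask]) + eps)
    return weights

def weighted_least_squares(X, y, weights):
    """Minimize sum_i w_i * (y_i - b0 - x_i . b)^2.

    Scaling each row by sqrt(w_i) turns the problem into ordinary
    least squares. Returns [intercept, coefficients...]."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    X1 = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    sw = np.sqrt(np.asarray(weights, dtype=float))
    beta, *_ = np.linalg.lstsq(X1 * sw[:, None], y * sw, rcond=None)
    return beta

# Toy data: one feature per document, two groups of two documents each.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = 2.0 + 3.0 * X[:, 0]
groups = np.array(["a", "a", "b", "b"])
beta = weighted_least_squares(X, y, group_variance_weights(y, groups))
```

Setting all weights to 1 recovers the unweighted fit, which makes the two approaches easy to compare on the same features.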
