Stylistic Analysis of Text Submissions to Japanese Q & A Communities

Abstract This study is a mixed-methods (qualitative and quantitative) analysis of the stylistic characteristics of texts submitted to Japanese Q & A communities. With the development of social media, Q & A communities are attracting considerable scholarly attention as important resources for analysing online communication. In Q & A communities, people freely submit questions and answers, questions are classified into subject categories, and best answers are selected. We analyse the stylistic characteristics of three types of submission (questions, best answers, and normal answers) in two subject categories: “personal computers and related devices” and “love and human relations advice”. The results show that textual style clearly distinguishes these six classes of texts and clarifies the character of each. Our findings provide useful knowledge about how communication styles differ across subject categories and about how people select communication styles. This study will contribute to research on current online communication styles.
