Coming to Terms: A Discourse Epistemetrics Study of Article Abstracts from the Web of Science

This study investigates how well a set of social and epistemic terms distinguishes among the disciplines of research article abstracts, and what characterizes the terms that do so, using a corpus of 928,572 abstracts from 13 disciplines indexed by Web of Science in 2011. We apply the machine-learning approach to discourse epistemetrics, using a sequential minimal optimization (SMO) algorithm and a feature set of terms derived from Hyland’s (2005) metadiscourse studies per Demarest and Sugimoto (2014), and report the subsets of terms that best and least distinguish among disciplines. The terms least able to distinguish among disciplines are rarely used and are overwhelmingly adjectival or adverbial markers of authorial attitude, reflecting personal positioning. The terms best able to distinguish among disciplines are mostly verbs frequently used as engagement markers, which frame the generation of knowledge for the readership in ways that are standardized within disciplines while varying among them. We plan to analyze the findings of this research-in-progress from both discipline-based and term-based perspectives, incorporating both into a two-mode network, and to incorporate finer-grained data for specific specializations for comparison with the current higher-level disciplinary findings.

[1] Kevin W. Boyack, et al. Toward a consensus map of science, 2009, J. Assoc. Inf. Sci. Technol.

[2] Ying Ding, et al. Scholarly network similarities: How bibliographic coupling networks, citation networks, cocitation networks, topical networks, coauthorship networks, and coword networks relate to each other, 2012, J. Assoc. Inf. Sci. Technol.

[3] K. Hyland, et al. Metadiscourse: Exploring Interaction in Writing, 2005.

[4] J. Dodick, et al. Conjunction and Modal Assessment in Genre Classification: A Corpus-Based Study of Historical and Experimental Science Writing, 2004.

[5] Ismael Rafols, et al. A global map of science based on the ISI subject categories, 2009, J. Assoc. Inf. Sci. Technol.

[6] K. Hyland, et al. Metadiscourse in academic writing: A reappraisal, 2004.

[7] Jeannett Martin, et al. The Language of Evaluation: Appraisal in English, 2005.

[8] Cassidy R. Sugimoto, et al. Argue, observe, assess: Measuring disciplinary identities and differences through socio‐epistemic discourse, 2015, J. Assoc. Inf. Sci. Technol.

[9] Ian H. Witten, et al. The WEKA data mining software: an update, 2009, SKDD.

[10] D. Biber, et al. Styles of stance in English: Lexical and grammatical marking of evidentiality and affect, 1989.

[11] J. Platt. Sequential Minimal Optimization: A Fast Algorithm for Training Support Vector Machines, 1998.

[12] Stefan Kaufmann, et al. Language and Ideology in Congress, 2011, British Journal of Political Science.

[13] Kevin W. Boyack, et al. Mapping the backbone of science, 2004, Scientometrics.

[14] A. Biglan. The characteristics of subject matter in different academic areas, 1973.

[15] Michael Halliday, et al. An Introduction to Functional Grammar, 1985.