From Words to Numbers: A Generalized and Linguistics-Based Coding Procedure for Collecting Textual Data

In this paper, I describe a new, powerful technique for coding data from textual sources, a technique based on concepts developed in the field of linguistics, particularly on the concept of semantic text grammar. I develop a grammar of data collection from the simple linguistic canonical structure noun phrase/verb phrase, namely, subject/action/object and their modifiers. I show that the grammar allows researchers to collect richer and more flexible data than more traditional coding schemes. In particular, the grammar produces coded output that, to a large extent, preserves both the syntax and the lexicon of the source material. Furthermore, the grammar makes the process of data collection independent of any prior specification of hypotheses, because virtually all relevant information provided by the sources can be easily coded.
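As a rough illustration of the subject/action/object structure with modifiers, the sketch below shows one way a coded clause might be stored and later turned into counts. It is not the paper's implementation: the class names (`Element`, `Triplet`), the modifier keys, the helper `count_actions`, and the sample records are all hypothetical, chosen only to make the idea concrete.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Element:
    """A subject, action, or object, plus its modifiers (hypothetical layout)."""
    head: str
    modifiers: Dict[str, str] = field(default_factory=dict)


@dataclass
class Triplet:
    """One coded clause: subject/action/object. The object is optional."""
    subject: Element
    action: Element
    obj: Optional[Element] = None


def count_actions(triplets: List[Triplet]) -> Counter:
    """Move from words to numbers: tally how often each action was coded."""
    return Counter(t.action.head for t in triplets)


# Invented sample records, for illustration only.
coded = [
    Triplet(Element("workers", {"number": "500"}),
            Element("strike", {"place": "factory gate"}),
            Element("employer")),
    Triplet(Element("police"),
            Element("arrest"),
            Element("workers", {"number": "12"})),
    Triplet(Element("workers"),
            Element("strike")),
]

action_counts = count_actions(coded)
```

Because each record keeps the source's own words in `head` and `modifiers`, the same coded data can be aggregated in different ways later, without committing to a fixed set of categories before coding begins.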