The enemy within: Autocorrelation bias in content analysis of narratives

Many content analysis studies involving temporal data are biased by an unknown amount of autocorrelation. Autocorrelation inflates or deflates the apparent significance of differences among the parts of the texts being compared. The solution consists of removing the effects of autocorrelation, even when the autocorrelation itself is not statistically significant. Procedures such as Crosbie's (1993) ITSACORR remove the effect of at least first-order autocorrelation and can be used with small samples. The AREG procedure of SPSS (1994) and the AUTOREG procedure of SAS (1993) can be used to detect and remove first-order autocorrelation (and, in the case of AUTOREG, higher-order autocorrelation as well), and several methods specifically intended for small samples have also been developed (Huitema and McKean, 1991, 1994). Four examples of content analysis studies with and without autocorrelation are discussed.
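To make "detecting and removing first-order autocorrelation" concrete, here is a minimal Python sketch. It is not ITSACORR, AREG, or AUTOREG, and it does not implement the small-sample estimators of Huitema and McKean. It assumes a simulated first-order autoregressive series standing in for content scores measured on successive text segments, and it uses the ordinary lag-1 estimator followed by simple differencing-style prewhitening. The function names (`lag1_autocorr`, `prewhiten`, `simulate_ar1`) are hypothetical and chosen only for this illustration.

```python
import random


def lag1_autocorr(series):
    """Ordinary lag-1 autocorrelation estimate of a sequence of numbers."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    num = sum(dev[t] * dev[t - 1] for t in range(1, n))
    return num / denom


def prewhiten(series, r1):
    """Remove first-order dependence: y[t] = x[t] - r1 * x[t-1].

    The result is one observation shorter than the input.
    """
    return [series[t] - r1 * series[t - 1] for t in range(1, len(series))]


def simulate_ar1(n, phi, seed=0):
    """Simulated AR(1) series (an assumption standing in for real scores)."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1)]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0, 1))
    return x


# Hypothetical data: 500 successive text segments with true phi = 0.7.
scores = simulate_ar1(500, 0.7)
r_raw = lag1_autocorr(scores)
r_white = lag1_autocorr(prewhiten(scores, r_raw))
print(f"lag-1 autocorrelation before prewhitening: {r_raw:.3f}")
print(f"lag-1 autocorrelation after prewhitening:  {r_white:.3f}")
```

Comparisons among parts of a text would then be run on the prewhitened series rather than on the raw scores. With short series the ordinary estimator used here is known to be biased, which is why the abstract points to methods designed for small samples.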

[1]  John M. Gottman,et al.  Time Series Analysis: A Comprehensive Introduction for Social Scientists. , 1983 .

[2]  Colin Martindale The clockwork muse: The predictability of artistic change. , 1991 .

[3]  D P McKenzie,et al.  Autocorrelations and admission diversion. , 1996, Psychiatric services.

[4]  R. Harald Baayen,et al.  Statistical models for word frequency distributions: A linguistic evaluation , 1992, Comput. Humanit.

[5]  Colin Martindale,et al.  On the utility of content analysis in author attribution: The Federalist , 1995, Comput. Humanit.

[6]  Edmond Chow,et al.  A cross-validatory method for dependent data , 1994 .

[7]  Fernand Braudel,et al.  Histoire et Sciences sociales: La longue durée , 1958, Annales. Histoire, Sciences Sociales.

[8]  Gwilym M. Jenkins,et al.  Time series analysis, forecasting and control , 1971 .

[9]  Jürgen H. P. Hoffmeyer-Zlotnik,et al.  Text analysis and computers , 1996 .

[10]  Douglas D. Short,et al.  Sixth International Conference on Computers and the Humanities , 1983 .

[11]  Donald W. Zimmerman A Note on Nonindependence and Nonparametric Tests , 1993 .

[12]  R. Hogenraad,et al.  Les mots qui ont fait les relations industrielles , 1994 .

[13]  D.S.G. Pollock The Methods of Time-Series Analysis , 1987 .

[14]  Nancy Ide,et al.  A statistical measure of theme and structure , 1989, Comput. Humanit.

[15]  Lynn Hunt,et al.  Telling the Truth about History , 1995 .

[16]  A K Konopka,et al.  Noncoding DNA, Zipf's law, and language. , 1995, Science.

[17]  Dean Keith Simonton,et al.  Cross-sectional time-series experiments: Some suggested statistical analyses , 1977 .

[18]  D. Sprenkle,et al.  Integrating qualitative and quantitative research methods: a research model. , 1995, Family process.

[19]  Charles W. Ostrom Time Series Analysis: Regression Techniques , 1978 .

[20]  D. Osgood,et al.  Use of pooled time series in the study of naturally occurring clinical events and problem behavior in a foster care setting. , 1994, Journal of consulting and clinical psychology.

[21]  C. Judd,et al.  Data analysis: continuing issues in the everyday analysis of psychological data. , 1995, Annual review of psychology.

[22]  Joseph W. McKean,et al.  Autocorrelation estimation and inference with small samples. , 1991 .

[23]  Yves Bestgen,et al.  Psychology As Literature , 1992 .

[24]  C. Whissell,et al.  A Dictionary of Affect in Language: IV. Reliability, Validity, and Applications , 1986 .

[25]  Colin Martindale,et al.  Fame more fickle than fortune: On the distribution of literary eminence , 1995 .

[26]  Dean Keith Simonton,et al.  Psychology, Science, and History: An Introduction to Historiometry , 1990 .

[27]  Alan N. West,et al.  Primary process content in the King James Bible: The five stages of Christian mysticism , 1991, Comput. Humanit.

[28]  D. A. Kenny,et al.  Consequences of violating the independence assumption in analysis of variance. , 1986 .

[29]  J. Crosbie,et al.  Interrupted time-series analysis with brief single-subject data. , 1993, Journal of consulting and clinical psychology.

[30]  Charles W. Ostrom Time Series Analysis , 1990 .

[31]  Scott M. Smith,et al.  Computer Intensive Methods for Testing Hypotheses: An Introduction , 1989 .

[32]  Michael Stubbs,et al.  Collocations and Semantic Profiles: On the Cause of the Trouble with Quantitative Studies , 1995 .

[33]  Robert Hogenraad,et al.  Paper Trails of Psychology - the Words That Made Applied Behavioral-sciences , 1995 .

[34]  Victor Ginsburgh,et al.  The Queen Elisabeth Musical Competition: how fair is the final ranking , 1996 .

[35]  Joseph W. McKean,et al.  Two Reduced-Bias Autocorrelation Estimators: rF1 and rF2 , 1994 .

[36]  W W Tryon,et al.  Estimating and testing autocorrelation with small samples: a comparison of the C-statistic to a modified estimator. , 1993, Behaviour research and therapy.

[37]  Jacob Cohen,et al.  Applied multiple regression/correlation analysis for the behavioral sciences , 1979 .

[38]  A. Hayes Permutation test is not distribution-free: Testing H₀: ρ = 0. , 1996 .

[39]  Maxim J. Schlossberg,et al.  Time Series Analysis: Regression Techniques (2nd ed.). , 1991 .

[40]  G. A. Young,et al.  Bootstrap: More than a Stab in the Dark? , 1994 .

[41]  Lee Sigelman By their (new) words shall ye know them: Edith Wharton, Marion Mainwaring, and The Buccaneers , 1995, Comput. Humanit.