2. Sequence Comparison via Alignment and Gibbs Sampling: A Formal Analysis of the Emergence of the Modern Sociological Article

Various substantive literatures in sociology seek small regularities in sequences: turning points in the life course, catalytic moments in organizational change, sharp turns in occupational trajectories, and the like. Commonly these are turning points, but they may also be simple local patterns. This paper reports a method for discovering such regularities even when they are quite faint, applying that method to rhetorical regularities in sociological articles. The paper begins by analyzing the overall sequence structure of such articles and then gives a basic introduction to Gibbs sampling, one member of the broader class of Markov chain Monte Carlo (MCMC) methods. It then reports an algorithm employing Gibbs sampling to find local sequence regularities and applies that algorithm to demonstrate the subsequence regularities present in sociological articles. Substantively, the paper shows that the rhetorical structure of sociological articles changed from one pattern to another in the period 1895–1965 and that certain faint but standard rhetorical subsequences became characteristic of articles in the later period. Methodologically, it introduces a broad class of methods that provide effective approaches to a number of previously intractable statistical questions.
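To make the idea of finding faint local regularities by Gibbs sampling concrete, here is a minimal sketch of a generic site sampler in the style used for biological sequence motifs. It is not the paper's own algorithm. The function name `gibbs_motif_sampler`, the single-letter codes, the toy sequences, and all parameter values are hypothetical choices made for illustration. The sketch assumes exactly one fixed-width motif occurrence per sequence and that every symbol appears in the supplied alphabet.

```python
import random
from collections import Counter


def gibbs_motif_sampler(seqs, width, alphabet, iters=200, pseudo=0.5, seed=0):
    """Illustrative site sampler: returns one motif start per sequence."""
    rng = random.Random(seed)
    # Start from a random motif position in every sequence.
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]

    # Background symbol frequencies pooled over all sequences (smoothed).
    bg_counts = Counter(ch for s in seqs for ch in s)
    total = sum(bg_counts.values()) + pseudo * len(alphabet)
    bg = {a: (bg_counts[a] + pseudo) / total for a in alphabet}

    for _ in range(iters):
        for i, seq in enumerate(seqs):
            # Build the position-specific profile from all sequences except i.
            cols = [Counter() for _ in range(width)]
            for j, (s, st) in enumerate(zip(seqs, starts)):
                if j != i:
                    for k in range(width):
                        cols[k][s[st + k]] += 1
            denom = (len(seqs) - 1) + pseudo * len(alphabet)

            # Weight each candidate start in sequence i by the ratio of
            # its probability under the profile to that under background.
            weights = []
            for st in range(len(seq) - width + 1):
                w = 1.0
                for k in range(width):
                    ch = seq[st + k]
                    w *= ((cols[k][ch] + pseudo) / denom) / bg[ch]
                weights.append(w)

            # Draw the new start for sequence i in proportion to the weights.
            starts[i] = rng.choices(range(len(weights)), weights=weights)[0]
    return starts


# Hypothetical toy data: each letter stands for a coded rhetorical element.
articles = ["ABQHDRAC", "QHDRBBA", "CAQHDRB", "BBCQHDR"]
found = gibbs_motif_sampler(articles, width=4, alphabet="ABCDHQR")
print([a[s:s + 4] for a, s in zip(articles, found)])
```

Because the sampler is stochastic, a single run may settle on a shifted or spurious window. In practice one would compare several seeds and keep the alignment with the highest score.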
