Exploring Cities in Crime: Significant Concordance and Co-occurrence in Quantitative Literary Analysis

We present CoocViewer, a graphical tool for quantitative literary analysis, and demonstrate its use on a corpus of crime novels. The tool displays words and their significant co-occurrences, and includes a new visualization for significant concordances. The contexts of words and co-occurrences can also be displayed. After reviewing previous research and current challenges in the emerging field of quantitative literary research, we demonstrate how CoocViewer supports comparative research on literary corpora in a project-specific study, and how quantitative literary analysis lets us confirm or enhance our hypotheses.
