The sense of a connection: Automatic tracing of intertextuality by meaning

In literary study, intertextuality refers to the reuse of text in a way that generates new meaning or novel stylistic effects. In the digital humanities, algorithms for intertextual analysis most typically search for approximate lexical correspondence that can be described as paraphrase. In this article, we look at a complementary approach that more closely captures the behavior of a reader who perceives a meaningful connection between texts even when they share no words of the same form or stem, so that the match rests on semantics alone. The technique we employ for identifying such semantic intertextuality is the popular natural language processing strategy of semantic analysis. Unlike the typical scenario for semantic analysis, where a corpus of long-form documents is available, we examine the far more limited textual fragments that embody intertextuality. We are primarily concerned with texts from antiquity, where small phrases or passages often form the locus of comparison. In this vein, we present a case study of established parallels between Book 1 of Lucan’s Civil War and all of Vergil’s Aeneid. Applying semantic analysis to these texts, we are able to recover parallels that lexical matching cannot, as well as discover new and interesting thematic matches between the two works.
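To make the idea of matching by meaning rather than by shared words concrete, the following is a minimal sketch, not the article's implementation. It assumes that "semantic analysis" can be illustrated by latent semantic analysis (a truncated SVD of a term-by-passage count matrix), which the abstract does not itself specify. The passages are hypothetical two-lemma fragments invented for illustration, and the function names and the choice of rank `k=2` are likewise assumptions.

```python
import numpy as np

# Hypothetical toy passages (lemmatized Latin), invented for illustration;
# they are not data from the article.
passages = [
    ["bellum", "arma"],    # 0: war vocabulary
    ["arma", "ensis"],     # 1: shares a lemma with 0 and with 2
    ["ensis", "ferrum"],   # 2: shares no lemma with 0
    ["mare", "unda"],      # 3: sea vocabulary
    ["unda", "navis"],     # 4
    ["navis", "pontus"],   # 5
]

def term_passage_matrix(docs):
    """Build a (terms x passages) matrix of raw lemma counts."""
    vocab = sorted({w for d in docs for w in d})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d:
            counts[index[w], j] += 1.0
    return counts, vocab

def cosine_matrix(vectors):
    """Pairwise cosine similarity between row vectors."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.where(norms == 0.0, 1.0, norms)
    return unit @ unit.T

def lsa_passage_vectors(counts, k):
    """Project each passage into a rank-k latent space via truncated SVD."""
    _, s, vt = np.linalg.svd(counts, full_matrices=False)
    return (s[:k, None] * vt[:k]).T  # one row per passage

counts, vocab = term_passage_matrix(passages)
raw_sim = cosine_matrix(counts.T)                          # purely lexical
lsa_sim = cosine_matrix(lsa_passage_vectors(counts, k=2))  # latent space

print("lexical 0 vs 2:", raw_sim[0, 2])
print("latent  0 vs 2:", lsa_sim[0, 2])
print("latent  0 vs 3:", lsa_sim[0, 3])
```

In this toy setting, passages 0 and 2 share no lemma, so their lexical cosine is zero, yet the rank-2 space places them close together because both co-occur with the vocabulary of passage 1; passage 3, drawn from unrelated vocabulary, stays far from passage 0. Real fragments would need lemmatization, term weighting, and a much larger background corpus, none of which is modeled here.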
