Bioinformatics and Classical Literary Study

This paper describes a collaborative project among classicists, quantitative biologists, and computer scientists that applies ideas and methods drawn from the sciences to the study of literature. A core goal of the project is to use computational biology, natural language processing, and machine learning techniques to investigate intertextuality, reception, and related phenomena of literary significance. As a case study of our approach, we describe the use of sequence alignment, a common technique in genomics, to detect intertextuality in Latin literature. Sequence alignment is distinguished by its ability to find inexact verbal parallels, which makes it well suited to identifying phonetic resemblances in large corpora of Latin texts. Although especially suited to Latin, sequence alignment can in principle be extended to many other languages.
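To illustrate how sequence alignment can surface inexact verbal parallels, the following is a minimal sketch of local alignment in the Smith-Waterman style, applied to character strings. It is not the project's implementation: the function name `local_align`, the scoring values (`match`, `mismatch`, `gap`), and the sample strings are all assumptions chosen for illustration.

```python
def local_align(a, b, match=2, mismatch=-1, gap=-1):
    """Smith-Waterman style local alignment of two strings.

    Returns (score, aligned_a, aligned_b). The scoring parameters are
    illustrative assumptions, not values taken from the paper.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Flooring at 0 lets an alignment restart anywhere (the "local" part).
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until the score drops to 0.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")  # gap in b
            i -= 1
        else:
            out_a.append("-")  # gap in a
            out_b.append(b[j - 1])
            j -= 1

    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # Two spellings of the same phrase; the strings are illustrative only.
    score, x, y = local_align("arma virumque cano", "arma uirumque cano")
    print(score)
    print(x)
    print(y)
```

Because a mismatch or gap only lowers the score rather than ruling out a match, two passages that differ in spelling or by a few characters can still align strongly. A real application to a large corpus would likely need text normalization and a faster search strategy than this quadratic comparison of every pair.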
