The digital age has made large amounts of text available, and this prompts us to develop new, computer-supported reading strategies that can cope with text at that scale. One such "distant reading" strategy is stylometry, a method of quantitative text analysis that uses the frequencies of linguistic features such as words, letters or grammatical units to assess statistically how similar texts are to one another and to classify them on that basis. This method is applied here to French drama of the seventeenth century, more precisely to the now famous "Corneille/Molière controversy", in which some researchers claim that Pierre Corneille wrote several of the plays traditionally attributed to Molière. The methodological challenge, as this contribution shows, is that authorship, genre (comedy vs. tragedy) and literary form (prose vs. verse) all influence stylometric distance measures and classification. Cross-genre and cross-form authorship attribution must separate these competing signals if it is to produce reliable results. This contribution describes two attempts to do so, parameter optimization and feature-range selection, and concludes with some more general remarks about the use of quantitative methods in a hermeneutic discipline such as literary studies.
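To make the idea of a frequency-based distance concrete, the following is a minimal sketch in the spirit of Burrows's Delta, a widely used stylometric measure. The abstract does not specify which measure or settings the study uses, so everything here is an illustrative assumption: the function name `delta_distances`, the parameters `n_features` and `skip_top`, and the use of the population standard deviation are hypothetical choices. The `skip_top` parameter is one possible reading of "feature-range selection", namely taking a window of the frequency-ranked word list rather than always starting at the most frequent word.

```python
from collections import Counter
from itertools import combinations
from statistics import mean, pstdev


def delta_distances(texts, n_features=100, skip_top=0):
    """Pairwise Delta-style distances between tokenized texts.

    texts: dict mapping a text name to its list of word tokens.
    n_features, skip_top: hypothetical parameters selecting a window
    of the corpus-wide frequency ranking to use as features.
    """
    # Rank words by their frequency in the whole corpus.
    corpus_counts = Counter()
    for tokens in texts.values():
        corpus_counts.update(tokens)
    ranked = [word for word, _ in corpus_counts.most_common()]
    vocabulary = ranked[skip_top:skip_top + n_features]

    # Relative frequency of each feature word in each text.
    rel_freq = {}
    for name, tokens in texts.items():
        counts = Counter(tokens)
        rel_freq[name] = [counts[w] / len(tokens) for w in vocabulary]

    # Standardize each feature across texts (z-scores); features that
    # do not vary at all carry no information and are skipped.
    z_scores = {name: [] for name in texts}
    for i in range(len(vocabulary)):
        column = [rel_freq[name][i] for name in texts]
        mu, sigma = mean(column), pstdev(column)
        if sigma == 0:
            continue
        for name in texts:
            z_scores[name].append((rel_freq[name][i] - mu) / sigma)

    # Delta: mean absolute difference of z-scores over the features.
    return {
        (a, b): mean(abs(x - y) for x, y in zip(z_scores[a], z_scores[b]))
        for a, b in combinations(texts, 2)
    }
```

Calling `delta_distances` with a dictionary of tokenized plays returns one distance per pair of texts; lower values indicate more similar frequency profiles. Because genre and form also shape these profiles, a low distance alone would not settle an attribution question, which is the difficulty the abstract points to.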