Metadata that serve as semantic markup are important additional resources for digital humanities research. One example is the set of conceptual categories, annotated in folk narratives, that describe the macrostructure of a plot in terms of actors and their mutual relationships, actions, and their ingredients. Such categories originate in structural analysis: in fairy tales they are called functions (Propp, 1968), and in myths, mythemes (Lévi-Strauss, 1955); a related, overarching type of content metadata is the folklore motif (Uther, 2004; Jason, 2000).

In his influential study, Propp treated a corpus of tales from Afanas'ev's collection (Afanas'ev, 1945), establishing the basic recurrent units of the plot ('functions'), such as Villainy, Liquidation of misfortune, Reward, or Test of Hero, and the combinations and sequences of elements employed to arrange them into moves. His aim was to describe the DNA-like structure of the magic tale sub-genre as a novel basis for comparison. As a first step towards a story grammar, the Proppian model is relatively straightforward to formalize for computational semantic annotation, analysis, and generation of fairy tales.

Our study describes an effort towards creating a comprehensive XML markup of fairy tales following Propp's functions, using an approach that integrates functional text annotation with grammatical markup so that it can be used across text types, genres, and languages. The Proppian fairy tale Markup Language (PftML) (Malec, 2001) is an annotation scheme that enables narrative function segmentation, based on hierarchically ordered textual content objects. We propose to extend PftML so that the scheme additionally relies on linguistic information for the segmentation of texts into Proppian functions.
Textual variation is an important phenomenon in folklore, so it is beneficial to represent linguistic elements explicitly in computational resources that draw on this genre; current international initiatives also actively promote, and aim to technically facilitate, such integrated and standardized linguistic resources. We describe why and how the explicit representation of grammatical phenomena in literary models can provide interdisciplinary benefits for the digital humanities research community. We address these questions in two related lines of work, as part of our ongoing activities in the CLARIN and AMICUS projects. CLARIN aims to contribute to humanities research by creating and recommending effective workflows that use natural language processing tools and digital resources in scenarios where humanities or social sciences scholars conduct text-based research. AMICUS is interested in motif identification, in order to gain insight into higher-order correlations of functions and other content units in texts from the cultural heritage and scientific discourse domains. We expect significant synergies from the interaction of both projects with the PftML prototype.
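To make the idea of combining function segmentation with linguistic markup more concrete, the following is a minimal sketch in Python. It is not the actual PftML vocabulary: the element names (`tale`, `move`, `function`, `token`), the attributes (`type`, `pos`, `lemma`), and the two sample clauses are assumptions chosen purely for illustration. Only the function labels Villainy and Liquidation of misfortune come from Propp's inventory as named above.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: element and attribute names are illustrative
# assumptions, not the real PftML schema. Each <function> segment wraps
# tokens that carry part-of-speech and lemma information.
SAMPLE = """
<tale title="example">
  <move n="1">
    <function type="Villainy">
      <token pos="NOUN" lemma="dragon">dragon</token>
      <token pos="VERB" lemma="abduct">abducts</token>
      <token pos="NOUN" lemma="princess">princess</token>
    </function>
    <function type="Liquidation of misfortune">
      <token pos="NOUN" lemma="hero">hero</token>
      <token pos="VERB" lemma="rescue">rescues</token>
      <token pos="NOUN" lemma="princess">princess</token>
    </function>
  </move>
</tale>
"""


def verbs_by_function(xml_text):
    """Return (function type, [verb lemmas]) pairs in document order."""
    root = ET.fromstring(xml_text.strip())
    result = []
    for func in root.iter("function"):
        # Collect the lemmas of all tokens tagged as verbs in this segment.
        verbs = [
            tok.get("lemma")
            for tok in func.iter("token")
            if tok.get("pos") == "VERB"
        ]
        result.append((func.get("type"), verbs))
    return result


if __name__ == "__main__":
    for function_type, verbs in verbs_by_function(SAMPLE):
        print(function_type, verbs)
```

Keeping the grammatical layer inside each function segment means that a query such as "which verbs realize Villainy" can be answered from a single document, which is the kind of cross-layer access the proposed extension is meant to support.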
[1] R. Jakobson. On Russian Fairy Tales. 1966.
[2] R. Jakobson, et al. Russian Fairy Tales. 2012.
[3] Heda Jason, et al. Motif, Type and Genre: A Manual for Compilation of Indices & a Bibliography of Indices and Indexing. 2000.
[4] Thierry Declerck, et al. Integration of Linguistic Markup into Semantic Models of Folk Narratives: The Fairy Tale Use Case. LREC, 2010.
[5] 加藤 耕義, et al. Special Interview with Professor Hans-Jörg Uther (Germany): On the Publication of The Types of International Folktales. 2005.
[6] Nancy Ide, et al. Representing Linguistic Corpora and Their Annotations. LREC, 2006.
[7] Bill Broyles. Notes. The Classical Review, 1907.
[8] Lee Haring, et al. The Types of International Folktales: A Classification and Bibliography. Based on the System of Antti Aarne and Stith Thompson (review). 2006.
[9] C. Lévi-Strauss. The Structural Study of Myth. 1955.
[10] William Hansen, et al. Motif, Type and Genre: A Manual for Compilation of Indices and A Bibliography of Indices and Indexing. 2002.
[11] Vladimir Propp, et al. Morphology of the Folktale. 1959.