Challenges in Enabling Mixed Media Scholarly Research with Multi-media Data in a Sustainable Infrastructure

Large-scale infrastructure projects in the humanities and social sciences, such as the Digital Research Infrastructure for the Arts and Humanities (DARIAH) (Edmond et al., 2017) or the Common Language Resources and Technology Infrastructure (CLARIN) (Hinrichs and Krauwer, 2014), aim to provide solutions for both the preservation of and access to the collections and data needed for scholarly research (Zundert, 2012). Some infrastructure projects build decentralized “atomic” software services, as in the LLS infrastructure project (Buchler et al., 2016), while others build more centralized virtual research environments, as in the European Holocaust Research Infrastructure (EHRI) (Lauer, 2014). The two models can also coexist within a single infrastructure project. This is the case in the CLARIAH infrastructure, where different approaches have been taken to date to serve different user groups: several specialized tools for linguists (Odijk, Broeder & Barbiers, 2015), and a research environment (the Media Suite) that serves the needs of scholars working with audiovisual data collections and related mixed-media contextual sources maintained at cultural heritage and knowledge institutions. This paper discusses the rationale for and challenges in developing the Media Suite.

[1] Joris van Zundert, et al. If You Build It, Will We Come? Large Scale Digital Infrastructures as a Dead End for Digital Humanities, 2012.

[2] Franciska de Jong, et al. Audio-visual Collections and the User Needs of Scholars in the Humanities: a Case for Co-Development, 2011.

[3] Deb Roy, et al. Audio-Visual Sentiment Analysis for Learning Emotional Arcs in Movies, 2017, 2017 IEEE International Conference on Data Mining (ICDM).

[4] Marcel Worring, et al. Content-Based Image Retrieval at the End of the Early Years, 2000, IEEE Trans. Pattern Anal. Mach. Intell.

[5] Loretta Auvil, et al. Mapping mutable genres in structurally complex volumes, 2013, 2013 IEEE International Conference on Big Data.

[6] Lora Aroyo, et al. Enriching Media Collections for Event-Based Exploration, 2017, MTSR.

[7] Thomas Eckart, et al. Mining and analysing one billion requests to linguistic services, 2016, 2016 IEEE International Conference on Big Data (Big Data).

[8] Jennifer Edmond, et al. The DARIAH ERIC: Redefining Research Infrastructure for the Arts and Humanities in the Digital Age, 2017, ERCIM News.

[9] Christopher M. Danforth, et al. The emotional arcs of stories are dominated by six basic shapes, 2016, EPJ Data Science.

[10] Mark Hedges, et al. Scholarly primitives: Building institutional infrastructure for humanities e-Science, 2013, Future Gener. Comput. Syst.

[11] Andrew Piper. Novel Devotions: Conversional Reading, Computational Modeling, and the Modern Novel, 2015.

[12] Marijn Koolen, et al. Audiovisual media annotation using Qualitative Data Analysis Software: a comparative analysis, 2018.

[13] Benjamin M. Schmidt. Plot arceology: A vector-space model of narrative structure, 2015, 2015 IEEE International Conference on Big Data (Big Data).

[14] Toby Burrows, et al. Aggregating Cultural Heritage Data for Research Use: The Humanities Networked Infrastructure (HuNI), 2015, MTSR.