CLARIN: Distributed Language Resources and Technology in a European Infrastructure

CLARIN is a European Research Infrastructure that gives researchers in the humanities and social sciences access to digital language resources and tools from across Europe and beyond. This paper focuses on CLARIN as a platform for sharing language resources. It examines the services offered for aggregating language repositories and the value proposition for several communities whose data and services gain visibility through integration in CLARIN. The enhanced findability of language resources serves the social sciences and humanities (SSH) community at large and supports research communities that aim to collaborate on virtual collections for a specific domain. The paper also addresses the wider landscape of service platforms based on language technologies, which has the potential to become a powerful set of interoperable facilities for a variety of user communities.
