co-authors for the article by Stemle et al. (2019) and are re-used with the permission of the LCR volume editors.

Under Article 4, “personal data means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person...” (Commission, 2016, art. 4).

Consider Figure 1, where combining information from two sources, a learner text and sociodemographic metadata, can reveal a learner's identity. Even though the learner's name is not disclosed to data users, indirect clues can still be used to identify the person.

[Figure 1 (excerpt). Socio-demographic metadata: L1: Luxembourgian, Chinese; year of birth: 1986]
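The indirect identification illustrated by Figure 1 can be sketched in a few lines of Python. This is a minimal illustration and not a procedure described in the article: the records, field names, and helper functions below are all hypothetical, and the values are invented toy data. The sketch covers only the metadata side, showing how an attribute that is harmless on its own (year of birth) can single out one person once it is combined with a second attribute (L1).

```python
from collections import Counter

# Hypothetical toy metadata records; every value here is invented for illustration.
records = [
    {"id": "A", "l1": ("Luxembourgian", "Chinese"), "birth_year": 1986},
    {"id": "B", "l1": ("German",), "birth_year": 1986},
    {"id": "C", "l1": ("German",), "birth_year": 1986},
    {"id": "D", "l1": ("Italian",), "birth_year": 1990},
    {"id": "E", "l1": ("Italian",), "birth_year": 1990},
]


def group_sizes(rows, keys):
    """Count how many records share each combination of the given attributes."""
    return Counter(tuple(row[k] for k in keys) for row in rows)


def unique_records(rows, keys):
    """Return the ids of records whose attribute combination occurs exactly once."""
    sizes = group_sizes(rows, keys)
    return [row["id"] for row in rows if sizes[tuple(row[k] for k in keys)] == 1]


# Year of birth alone: every value is shared by at least two learners.
by_year = unique_records(records, ["birth_year"])

# L1 combined with year of birth: one learner becomes unique in the data set.
by_both = unique_records(records, ["l1", "birth_year"])

print(by_year)
print(by_both)
```

In this toy data set no learner is unique by year of birth alone, whereas the combination of L1 and year of birth isolates learner A. Clues found in the learner text itself would narrow the candidates further in the same way.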