Anonymous Data v. Personal Data — A False Debate: An EU Perspective on Anonymization, Pseudonymization and Personal Data

The era of big data analytics promises many things. In particular, it offers opportunities to extract hidden value from unstructured raw datasets through novel reuse. The reuse of personal data is, however, a key concern for data protection law, as it involves processing for purposes beyond those that justified the original collection, which is at odds with the principle of purpose limitation. The issue becomes one of balancing the private interests of individuals against the promise of big data. One way to resolve this issue is to transform personal data that will be shared for further processing into “anonymous information”, to use an EU legal term. “Anonymous information” is outside the scope of EU data protection laws, and is also carved out from privacy laws in many other jurisdictions worldwide. This solution works well in theory, but only as long as the transformed data retains its utility, which is not necessarily the case in practice. This leaves those in charge of processing the data with a problem: how to ensure that anonymization is conducted effectively on the data in their possession, while retaining its utility for potential future disclosure to, and further processing by, third parties? Despite broad consensus around the need for effective anonymization techniques, the debate as to when data can be said to be legally anonymized for the purposes of EU data protection laws is long-standing. Part of the complexity in reaching consensus derives from confusion around terminology, in particular the meaning of the concept of anonymization in this context, and how strictly delineated that concept should be. This can be explained, in turn, by a lack of consensus on the doctrinal theory that should underpin its traditional conceptualization as a privacy-protecting mechanism. Moreover, the texts of both the existing EU Data Protection Directive (DPD) and the new EU General Data Protection Regulation (GDPR) are ambiguous on this point.
This paper suggests that, although the concept of anonymization is crucial to demarcate the scope of data protection laws, at least from a descriptive standpoint, recent attempts to clarify the terms of the dichotomy between “anonymous information” and personal data (in particular, by EU data protection regulators) have partly failed. Although this failure could be attributed to the very use of a terminology that creates the illusion of a definitive and permanent boundary around the scope of data protection laws, the reasons are slightly more complex. Essentially, the failure can be explained by the implicit adoption of a static approach. Such an approach tends to assume that, once the data is anonymized, not only can the initial data controller forget about it, but recipients of the transformed dataset are also thereafter free from any obligations or duties, because the dataset always lies outside the scope of data protection laws. By contrast, the state of anonymized data has to be understood in context, which includes an assessment of the data, the infrastructure, and the agents. Moreover, the state of anonymized data should be understood dynamically: anonymized data can become personal data again, depending upon the purpose of the further processing and future data linkages, which implies that recipients of anonymized data have to behave responsibly. The paper starts by examining recent approaches to anonymization, highlighting their shortcomings. It then explains why a dynamic approach to anonymization is both more appropriate and compatible with the DPD and the GDPR. Ultimately, we conclude that the opposition between so-called “anonymous information” and personal data in a legal sense is less radical than usually described.