BLOCKCHAIN TECHNOLOGY AND DATA PROTECTION – A LOVE-HATE RELATIONSHIP

At first glance, blockchain technology (decentralized technologies) seems to be the perfect tool for better privacy protection. This technology could allow for new data governance models, and finally give natural persons (data subjects) real control over their personal data. Instead of a central entity controlling their users’ personal data in a non-transparent system, a blockchain can enable direct and transparent data sharing under the data subject’s own control. There are already a few projects that want to use blockchain for their data governance systems.

However, with Libra and Facebook’s poor reputation with regard to privacy, the focus has shifted somewhat, and the public, politicians and the data protection supervisory authorities have begun to question the compatibility of blockchain technology with current data protection laws.

To this day, the debate as to whether a blockchain can be reconciled with the GDPR continues. At the moment, there is no regulatory clarity: only some of the national data protection authorities have voiced their opinions, and to date there is neither a European Data Protection Board (EDPB) guideline nor any court decision to rely on. So, the only way forward is to approach the question from a risk perspective and to reduce the data protection risk as much as possible by using privacy-enhancing techniques.

The main problems are caused by the following confrontations between GDPR and blockchain technology:

Nobody owns or controls the system vs. a controller or joint controller is responsible for compliance with the GDPR
Public keys vs. anonymous data
Full decentralization and multiple data storage vs. data minimization
Immutability vs. the right to erasure and rectification

Who is the controller?

Even though the GDPR was designed in a technologically neutral fashion, the concept of decentralization, as with most regulations, does not fit within its framework. The GDPR is based on the notion that there is a natural person or a legal entity who controls the data, and to whom data subjects can turn to assert their rights under the GDPR. Therefore, as far as the GDPR is concerned, it must be possible to identify who is controlling even a fully decentralized (public and permissionless) blockchain. If a private, permissioned blockchain is used, as recommended by the French data protection authority, the Commission Nationale de l’Informatique et des Libertés (CNIL), the consortium should address this issue and identify the controller or joint controllers in advance. The use of a public and permissionless blockchain makes it much harder to determine who the controller is. At the moment, almost every participant can be regarded as a controller if they determine the purposes and the means of the processing. Even a user could become a controller if they carry out transactions on behalf of a data subject.

In summary, the following players could become controllers (joint controllers):

Developers
Players who control the blockchain or the nodes
Wallet providers
Publishers of smart contracts
Network users
Miners as data processors

In recent decisions (Facebook fanpage, Jehovah’s Witnesses, Facebook like button), the Court of Justice of the European Union (CJEU) has shown an expansive approach towards joint controllership. Hence, we can assume that it would take an expansive stand on joint controllership with regard to a blockchain ecosystem as well, even if the blockchain is public and permissionless. Creators / developers should address this issue and ensure that every potential controller can demonstrate compliance in the best way possible (accountability principle).

Is data on the blockchain personal data?

In most cases, a user (data subject) is only linked to the blockchain via the public key, which brings up the question of whether a public key (or the address derived from it, a numeric hash) is enough to constitute personal data. According to Article 4 (1) GDPR, personal data includes any information relating to an identified or identifiable natural person (data subject). Recital 26 of the GDPR specifies that personal data which has undergone pseudonymization, and which could be attributed to a natural person by the use of additional information, shall be considered to be information on an identifiable natural person. Furthermore, to determine whether a data subject is identifiable, “all the means reasonably likely to be used”, such as singling out, either by the controller or by another person, to identify the natural person directly or indirectly should be taken into account. To ascertain whether any means are reasonably likely to be used to identify the data subject, all of the objective factors should be taken into account, such as the costs and the amount of time required for identification, taking into consideration the technology available at the time of the processing as well as technological developments.

As a result, the distinction between personal data and anonymous data is not set in stone. It always requires a case-by-case assessment. The concept of personal data is not only very broad, but the definition of the phrase, “means reasonably likely to be used” to identify an individual, is also vague, and it often causes confusion and uncertainty among controllers. It is no surprise that the GDPR does not describe an assessment process or a threshold for “reasonably likely,” given the increase in the computational power, the technological developments available and the reduction in the effort and costs involved in identification. Hence, state-of-the-art technology needs to be considered.

Furthermore, Recital 30 of the GDPR mentions that a data subject “may be associated with an identifier provided by their devices, applications, tools and protocols, such as internet protocol addresses, cookie identifiers or other identifiers such as radio frequency identification tags. This may leave traces which, in particular when combined with unique identifiers and other information received by the servers, may be used to create profiles of the natural persons and identify them.”

In the “Breyer” case, the CJEU held that a numeric identifier, like an IP address, can constitute personal data if the controller or a third party has the means to identify the person.

Often, a public key can fulfill the characteristics of a numeric identifier, because a third party (e.g. wallet provider, exchange, etc.) is often able to identify the person. Therefore, a public key will most likely not be regarded as anonymous data. Public keys are pseudonymous data, and therefore, they must be treated as personal data, according to the GDPR.
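The reasoning above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the ledger entries, the addresses ("0xA1" etc.), the customer name and the helper functions are all hypothetical and do not describe any real blockchain or exchange. The sketch shows why a public key is pseudonymous rather than anonymous: the ledger itself contains no names, but once a third party holds additional information linking one address to a person, every transaction of that address can be attributed to that person.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    sender: str      # public key / address as it appears on the ledger
    recipient: str
    amount: float


# Hypothetical public ledger: visible to everyone, contains no names.
LEDGER = [
    Transaction("0xA1", "0xB2", 1.5),
    Transaction("0xB2", "0xC3", 0.7),
    Transaction("0xA1", "0xC3", 0.2),
]

# Hypothetical "additional information" held by a third party,
# e.g. the customer records of an exchange or wallet provider.
EXCHANGE_CUSTOMERS = {"0xA1": "Alice Example"}


def transactions_of(address, ledger):
    """Return every ledger entry in which the address is sender or recipient."""
    return [tx for tx in ledger if address in (tx.sender, tx.recipient)]


def reidentify(ledger, known_customers):
    """Attribute ledger entries to named persons using the additional information."""
    return {
        name: transactions_of(address, ledger)
        for address, name in known_customers.items()
    }


if __name__ == "__main__":
    for name, txs in reidentify(LEDGER, EXCHANGE_CUSTOMERS).items():
        print(name, "is linked to", len(txs), "transactions")
```

Nothing on the ledger has to change for the attribution to work; the link is created entirely by the additional information held off chain, which is the situation Recital 26 describes for pseudonymized data.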

Besides public keys, other so-called transactional data may be used and stored on a blockchain. Academics (Husam Al Jawaheri et al. or Sarah Meiklejohn et al.) have shown that transaction data alone can be enough to identify the person behind a transaction and/or a public key. In some cases, it is even possible to ascertain an IP address.

There are some anonymization techniques that can be applied to a blockchain, for example stealth addresses like those used by Monero, the use of one-time accounts for transactions, homomorphic encryption or the addition of noise. These techniques might solve the privacy issues, but they create problems with other regulations. Financial transactions, in particular, have to be monitored by the service provider. Financial service providers have to comply with strict anti-money-laundering rules. The Financial Action Task Force (FATF) stipulates that virtual asset transactions are not exempt from this. On 21 June 2019, the FATF issued new guidance on financial services in the context of blockchain technology, and made it clear that information about the client and the beneficiary must be transmitted together with the transfer of tokens.

Is data minimization even possible?

The data minimization principle according to Article 5 (1)(c) GDPR requires the limitation of data processing to what is necessary in relation to the purposes for which the data is processed. As storage is also considered processing, decentralized and multiple storage conflicts with this principle. Yet, there is no legal certainty that multiple storage will infringe on this principle. In a public, permissionless blockchain, decentralized storage poses an additional risk, as data could potentially be transferred outside of the EU to a third country. In contrast to a private and permissioned blockchain, standard contractual clauses, binding corporate rules, codes of conduct or even certification mechanisms cannot be implemented in a public blockchain.

How to ensure the rights of the data subject?

First of all, as a controller you have to take appropriate measures to provide the required information according to the GDPR. As mentioned above, it is still not clear who the controller in a public and permissionless blockchain is, and therefore, most participants are hesitant to provide such information, as it could be interpreted as an acknowledgment of controllership. By contrast, in a permissioned blockchain, the consortium is obliged to provide such information, and therefore should discuss controllership ahead of time.

The right of access is the least problematic requirement, as a data subject often already has access to the ledger.

By contrast, the rights to rectification (Article 16 GDPR) and erasure (Art. 17 GDPR) are still the biggest issue and the most debated topic, as they are diametrically opposed to immutability, which is one of the core elements of a blockchain. If you allow a third party (controller) to change or erase entries on a blockchain, the core value proposition no longer exists. As the GDPR is still a new regulation, we also have no legal certainty as to how the notion of erasure in Art. 17 GDPR ought to be interpreted. On the one hand, the CNIL considers that the participants’ identifiers (public keys) cannot be further minimized and that their retention periods are, in essence, in line with the blockchain’s duration of existence. On the other hand, some are also working on editable blockchains, but then the blockchain would lose its main value proposition, and it might be better to use other forms of databases instead.

To reduce the risk, personal data should be kept off chain whenever possible, but at the moment, we lack legal certainty regarding rectification (Article 16 GDPR) and erasure (Art. 17 GDPR).
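One common way to keep personal data off chain can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the class OffChainStore, its methods and the list standing in for the ledger are all hypothetical. Only a salted SHA-256 digest is written to the immutable ledger, while the personal data and the random salt live in an ordinary, editable store under the controller’s control. Rectification replaces the off-chain record and appends a new digest; erasure deletes the record together with its salt, so that the digest left on chain can no longer be checked against anything. Whether such a leftover digest satisfies Art. 17 GDPR is precisely the open legal question described above.

```python
import hashlib
import hmac
import secrets


class OffChainStore:
    """Hypothetical mutable store for personal data, kept off chain."""

    def __init__(self):
        self._records = {}  # record_id -> (salt, data)

    def _digest(self, salt: bytes, data: bytes) -> str:
        return hashlib.sha256(salt + data).hexdigest()

    def put(self, record_id: str, data: bytes) -> str:
        """Store (or overwrite) a record and return the digest to anchor on chain."""
        salt = secrets.token_bytes(32)  # fresh random salt for every version
        self._records[record_id] = (salt, data)
        return self._digest(salt, data)

    def verify(self, record_id: str, on_chain_digest: str) -> bool:
        """Check whether the stored record still matches a digest from the ledger."""
        entry = self._records.get(record_id)
        if entry is None:
            return False
        salt, data = entry
        return hmac.compare_digest(self._digest(salt, data), on_chain_digest)

    def erase(self, record_id: str) -> None:
        """Delete the data and the salt; the on-chain digest stays but is unlinkable."""
        self._records.pop(record_id, None)


if __name__ == "__main__":
    ledger = []  # append-only list standing in for the blockchain
    store = OffChainStore()

    ledger.append(store.put("customer-1", b"Alice Example, Vienna"))
    print(store.verify("customer-1", ledger[0]))   # record matches the first digest

    # Rectification: new off-chain version, new digest appended to the ledger.
    ledger.append(store.put("customer-1", b"Alice Example, Graz"))
    print(store.verify("customer-1", ledger[0]))   # old digest no longer matches

    # Erasure: off-chain data and salt are removed, ledger entries remain.
    store.erase("customer-1")
    print(store.verify("customer-1", ledger[1]), len(ledger))
```

The random salt matters in this sketch: without it, anyone could hash a guessed value (a name, an e-mail address) and compare it with the digest on the ledger.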

Questions you should ask yourself when using blockchain technology

How much personal data does my business case require?
Is the use of blockchain technology the best solution?
Do I use a private, permissioned blockchain or a public, permissionless blockchain?
Can I store personal data off chain?
If personal data is stored on a blockchain, what privacy enhancing techniques (data obfuscation, encryption, aggregation techniques, etc.) can be applied?
Are privacy by design and privacy by default implemented?
Have I done a Data Protection Impact Assessment?

Further Resources

Commission Nationale de l’Informatique et des Libertés (CNIL), Blockchain: Solutions for a responsible use of the blockchain in the context of personal data.
The European Union Blockchain Observatory and Forum, Blockchain and the GDPR.
EPRS | European Parliamentary Research Service, Blockchain and the General Data Protection Regulation: Can distributed ledgers be squared with European data protection law?