I Know You Are But What Am I?

Many processors (e.g. SaaS providers) do not have access to the personal data that their clients collect on their platform. This is usually an intentional restriction, meant to protect personal data and limit the processor's liability. Some processors anonymize data while others pseudonymize (e.g. tokenize) data by removing identifiers. From the controller's perspective, this is a good measure: the controller does not need to disclose personal data it is responsible for to yet another party, or worry about what the recipient will do with it, whether intentionally or not. And of course the end user whose data is in question is better protected, and privacy is maintained, when a service provider cannot access their personal data.

 In this scenario, while the personal data is anonymized or pseudonymized vis-a-vis the processor, it is identifiable to the controller.

The big and longstanding debate is whether anonymized data processed by one party (i.e. the processor) is in fact personal data because it is personal data to the controller. Can the same data be labelled differently by different parties? Can data be personal data for one party while being anonymized data for another?

 A recent case put this debate to rest…for the time being.

While we would love to save the big reveal for the end of the article, we aren’t that cruel. The Court of Justice of the European Union (CJEU), in the case of C‑319/22, Gesamtverband Autoteile-Handel v. Scania, confirmed that data can be non-personal data for one party, even if the same data is personal data for another party.

While this decision is seemingly great news for many businesses, it’s a good idea to investigate the analysis of this decision and others, and then consider whether relying on anonymized data is a free pass for not complying with privacy legislation.

Labels, Labels, Labels!

What is personal data anyway? This is one of the first questions any company looking to comply with privacy legislation should ask itself, because if you are not collecting or processing personal data, then privacy legislation does not apply to you.

Unfortunately, this seemingly simple question is anything but simple. The truth is that there aren’t many companies that DON’T collect or process personal data. Most recently, the Office of the Privacy Commissioner of Canada, in its Annual Report to Parliament, made an example of an agronomic company, which is in the business of crop production and agriculture technology. Surprisingly, it was collecting personal data, and more shockingly, some very sensitive personal data. So no industry is in the clear.  

 To make things even more complicated, personal data can be anonymized or pseudonymized (or somewhere on that spectrum) depending on the data. So, ya, confusing!

 Let’s dive a bit deeper.

According to Article 4(5) of the GDPR, pseudonymization is the processing of personal data in such a manner that it can no longer be attributed to a specific data subject without the use of additional information, provided that the additional information is kept separately and is subject to technical and organizational measures to ensure that the personal data is not attributed to an identified or identifiable natural person. Recital 26 adds that personal data which has undergone pseudonymization, and which could be attributed to a natural person by the use of additional information, should still be considered personal data.
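To make the definition concrete, here is a minimal sketch of tokenization, one common way to pseudonymize. It is an illustration only, not a compliance recipe: the function names, the `email` field and the sample records are all hypothetical. The point to notice is that the token map is the "additional information" from Article 4(5). Whoever holds it can re-identify; whoever receives only the tokenized records cannot do so from those records alone.

```python
import secrets


def pseudonymize(records, id_field):
    """Replace id_field with a random token.

    Returns (tokenized_records, token_map). The token_map is the
    "additional information" and must be kept separately.
    """
    forward = {}    # identifier -> token, so repeat values get the same token
    token_map = {}  # token -> identifier, held only by the controller
    tokenized = []
    for record in records:
        identifier = record[id_field]
        if identifier not in forward:
            token = secrets.token_hex(8)  # random, not derived from the identifier
            forward[identifier] = token
            token_map[token] = identifier
        tokenized.append({**record, id_field: forward[identifier]})
    return tokenized, token_map


def reidentify(tokenized_records, id_field, token_map):
    """Reverse the tokenization. Only possible with the token_map."""
    return [{**r, id_field: token_map[r[id_field]]} for r in tokenized_records]


# Hypothetical sample data held by the controller.
customers = [
    {"email": "ana@example.com", "plan": "pro"},
    {"email": "ben@example.com", "plan": "free"},
    {"email": "ana@example.com", "plan": "pro"},
]

# "shared" is what the processor would receive; token_map stays with the controller.
shared, token_map = pseudonymize(customers, "email")
restored = reidentify(shared, "email", token_map)
```

Note that removing the direct identifier is only part of the picture. The remaining fields (here, `plan`) can still single someone out in combination, which is why the "means reasonably likely to be used" analysis still has to be done.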

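According to Recital 26 of the GDPR, anonymous information is information that does not relate to an identified or identifiable natural person, or personal data rendered anonymous in such a manner that the data subject is no longer identifiable. Such information is not subject to the GDPR.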

 To determine whether data is pseudonymous or anonymous, account should be taken of all means reasonably likely to be used to identify the individual. Objective factors such as cost, the amount of time, and available technology required for re-identification should be considered.  

Breyer v. Bundesrepublik Deutschland laid out the “reasonable” test to measure the difference between pseudonymous data and anonymous data. The UK ICO takes a similar approach in its Code of Practice on Anonymization, which confirms that a contextual analysis is necessary when determining how to label the data. In evaluating the data, the following test should be applied:

(a)   What is the likelihood that an attempt to identify an individual will be made (based on controls around the data); and

(b)  What is the likelihood that the attempt will be successful (taking into consideration cost, time, technology etc.)?

The above test suggests that a contextual analysis is necessary to make a determination. It further leads to the proposition that data can fall within a continuum whereby on the one end, data is pseudonymous and on the other end data is anonymous.

 Each party undertaking this contextual analysis should consider its own context, rather than other parties that may have access to the data. So depending on which party (i.e. controller, processor, sub-processor) answers the above questions, it is possible that data which is pseudonymized for one party can be anonymous for another party.  

Context Matters…So, It Depends

Lucky for us, a recent EU ruling solidified the idea that every party must determine for itself whether the data it collects or processes is personal data, irrespective of how another party labels that same data.

In the recent case of Gesamtverband Autoteile-Handel v. Scania, the main question was whether a Vehicle Identification Number (VIN) is personal data. When a VIN is created, it is not related to an identifiable individual. Once an individual purchases the vehicle, however, the VIN can identify a person. The dispute was about whether disclosing a VIN constitutes disclosure of personal data, since an EU regulation requires vehicle manufacturers to make VINs available to independent operators in the market.

The CJEU summarized its ruling as follows:

"The Court points out that vehicle identification numbers must be included in the database. That number, taken as such, is not personal in nature. However, it becomes personal data when someone who has access to it has the means to identify the owner of the vehicle, provided that the person concerned is a natural person. The Court notes in that regard that the owner is, like the identification number, indicated in the registration certificate. Even where vehicle identification numbers are to be classified as personal data, the GDPR does not preclude car manufacturers from being obliged to make them available to independent operators."

In other words, while the VIN is not personal data in and of itself, it can become personal data when combined with additional data, and whether that happens depends on whether you, as one of the parties, have that additional data available to you.

Some may draw a comparison to the much-debated topic of whether IP addresses are personal data. Many processors have access to IP addresses, which on their own are anonymous and lack identifiable information. However, IP addresses have long been considered personal data because internet service providers can connect an IP address to an individual, and it is therefore “technically” possible to re-identify that individual (e.g. through a subpoena). Applying the reasoning in Scania to the IP address scenario, one could conclude that IP addresses may no longer be personal data in the hands of a processor that has no means of linking them to an individual.

What does this mean for businesses?

-       If you were under the impression that you were processing personal data simply because the controller labels the data as personal data, reconsider whether you are processing personal data or anonymized data. In particular, processors should consider whether they can identify an individual with the data they have. If they cannot, irrespective of whether the controller can, the data may not fall under the personal data category for them, in which case the GDPR would not apply to their processing of it.

-       If identifiers were removed from the personal data and stored elsewhere, then so long as a party does not have access to those identifiers, the data is anonymized for that party, even if the identifiers are available elsewhere.

While it may seem that many processors are going to have a field day over this ruling, they should be cautious about taking a strong position that the data they process is anonymized and that privacy legislation therefore does not apply to them. First, the processor will need to demonstrate that the data is anonymized, which can be difficult, especially when the data likely falls somewhere on a spectrum. With a contextual “reasonable” test, there is lots of room for interpretation and subjectivity, and regulators may not agree with your analysis of the data. Businesses should take a risk-based approach when deciding whether to comply with privacy legislation. Second, processors should consider the industry they are in. If you are a processor in a heavily regulated industry, like the financial industry, the controller will likely insist, as part of its due diligence, that you comply with privacy legislation, because it will not want to accept the risk-based approach you took to conclude that the data is anonymous. In that case, the processor will need to comply regardless of its own analysis.

For processors that do their analysis and conclude that their data is anonymous, it is highly recommended that they conduct a staged re-identification attack every few years to determine the probability of re-identification.
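As a rough illustration of where such an exercise might start, the sketch below counts how many records are unique on a set of quasi-identifiers (fields that are not direct identifiers but can single a person out in combination). This is only one simple proxy for re-identification risk, not a full staged attack, which would also involve linking against outside data sources. The function name, the field names and the sample records are hypothetical.

```python
from collections import Counter


def uniqueness_report(records, quasi_identifiers):
    """Summarize how easily records can be singled out.

    Groups records by their combination of quasi-identifier values and
    reports the smallest group size and the share of records that are
    alone in their group.
    """
    if not records:
        raise ValueError("records must not be empty")
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    unique = sum(1 for k in keys if counts[k] == 1)
    return {
        "records": len(records),
        "smallest_group": min(counts.values()),
        "unique_fraction": unique / len(records),
    }


# Hypothetical "anonymized" dataset held by a processor.
sample = [
    {"postcode": "M5V", "birth_year": 1980, "plan": "pro"},
    {"postcode": "M5V", "birth_year": 1980, "plan": "free"},
    {"postcode": "K1A", "birth_year": 1975, "plan": "pro"},
    {"postcode": "H2X", "birth_year": 1990, "plan": "free"},
]

report = uniqueness_report(sample, ["postcode", "birth_year"])
```

A high share of unique records does not prove that anyone can be identified, and a low share does not prove the opposite. It is simply a signal that the likelihood-of-success question in the test above deserves a closer look.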

Finally, processors that treat their data as anonymous and do not comply with privacy regulations, when in fact the data is pseudonymous, may face hefty fines for non-compliance, as well as class-action lawsuits, loss of business, and reputational harm.

 

 
