With the General Data Protection Regulation becoming effective May 25, 2018, organizations (or rather, organisations) seem to be stressing a bit. Most we speak with are asking, “where do we even start?” or “what is included as personal data under the GDPR?”

These are exactly the questions organizations should be asking. But to know where to start, an organization first needs to understand how the GDPR applies to it under this new definition of personal data. Without knowing what to look for, an organization cannot perform data discovery and data mapping exercises, review its data management practices, or prepare for compliance with the GDPR.

Personal data redefined…sort of.

To start – is personal data redefined by the GDPR? Yes. Is it more encompassing of a definition? Yes. Does it provide a good amount of guidance on interpretation of said definition? In some areas, but not in others.

The Articles of the GDPR open with a list of definitions in Article 4 that provide some guidance on how to digest the remainder of the regulation—the recitals also contain some nuggets of wisdom if you have time to review. Personal data is the very first definition listed under Article 4, hinting that it is most likely pertinent to a comprehensive understanding of the regulation. Article 4(1) states:

‘Personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person.

In breaking down this definition, there are a few key phrases to focus on. ‘Any information’ is the big one, as it confirms that personal data, under this regulation, is not limited to a particular group or type of data. ‘Relating to’ specifies that personal data can encompass any group or type of data, as long as the data is tied to or related to something else. What is that something else? A ‘natural person’. A natural person is just that: an actual human being to whom the data applies.

You may have noticed I skipped the ‘an identified or identifiable’ portion of the definition—identified or identifiable means that the natural person has either already been identified, or can readily be identified utilizing other available information. Article 4(1) adds further clarity here, stating that an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person.

The fact that name, identification number, location data and online identifier are specifically referenced at the beginning of this definition is important, as those pieces of data serve to directly identify an individual. If that specific data is held by the organization, all related data is in scope.

However, if those unique identifiers are not held, your organization should review the list of other factors that could identify the natural person and bring any data that does so into scope. For example, you may not have John Smith’s name in your database, but you may have a salary, company name, and city that point directly to John Smith when linked together.
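To make the linkage point concrete, here is a minimal Python sketch. It is an illustration only, not a method the GDPR prescribes. It assumes a hypothetical table that stores no names, and it counts how many records share each combination of salary, company, and city. The records, the field names, and the helper `singled_out` are all invented for this example.

```python
from collections import Counter

# Hypothetical records: no name or identification number is stored anywhere.
records = [
    {"salary": 85000, "company": "Acme Corp", "city": "Tampa"},
    {"salary": 85000, "company": "Acme Corp", "city": "Tampa"},
    {"salary": 142000, "company": "Acme Corp", "city": "Tampa"},
    {"salary": 85000, "company": "Globex", "city": "Denver"},
]


def singled_out(rows, fields):
    """Return the rows whose combination of `fields` is unique in the table."""
    keys = [tuple(row[f] for f in fields) for row in rows]
    counts = Counter(keys)
    return [row for row, key in zip(rows, keys) if counts[key] == 1]


# Rows that no other row matches on all three fields taken together.
unique_rows = singled_out(records, ["salary", "company", "city"])
```

A combination held by exactly one record points to one person once it is linked with outside knowledge, which is why such fields may need to be treated as in scope even when no name is present.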

In addition to the new definition of personal data, the GDPR also adds some more specificity around what it deems “special categories” of personal data. Article 9(1) states that processing of personal data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, or trade-union membership, and the processing of genetic data, biometric data for the purpose of uniquely identifying a natural person, data concerning health or data concerning a natural person’s sex life or sexual orientation shall be prohibited.

This definition is important because it places certain personal data in a subcategory with stricter processing requirements. Although the requirement above states that processing of special categories of personal data is prohibited, it is important to note that there are exceptions to this rule. Organizations should reference Article 9 if they believe special categories of data to be in scope.
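As a rough illustration of how these two definitions might feed a first-pass data inventory, the Python sketch below tags field names as identifiers named in Article 4(1), special categories listed in Article 9(1), or items needing manual review. The field names, the two lookup sets, and the `classify` function are hypothetical; a real schema would need its own mapping and legal review.

```python
# Identifiers named in Article 4(1), mapped to hypothetical field names
# as they might appear in a database schema.
ARTICLE_4_IDENTIFIERS = {"name", "national_id", "gps_location", "ip_address"}

# Categories listed in Article 9(1), again mapped to hypothetical field names.
SPECIAL_CATEGORIES = {
    "ethnic_origin", "political_opinion", "religion", "union_membership",
    "genetic_profile", "fingerprint_template", "health_condition",
    "sexual_orientation",
}


def classify(field):
    """Label a single field name for a first-pass data inventory."""
    if field in SPECIAL_CATEGORIES:
        return "special category"
    if field in ARTICLE_4_IDENTIFIERS:
        return "identifier"
    # Anything else may still identify someone when combined with other data.
    return "review manually"


schema = ["name", "ip_address", "health_condition", "salary"]
inventory = {field: classify(field) for field in schema}
```

The "review manually" label is deliberate: fields such as salary are not named identifiers, but they can still fall under the definition when linked with other data.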

So how does this definition differ from previous definitions of personal data?

Even though the GDPR “redefines” personal data, is it really all that different from existing definitions? As a baseline, let’s refer to two of the more commonly used definitions for personal data taken from the GDPR’s predecessor—the Data Protection Directive—and NIST 800-122.

The Data Protection Directive defines personal data in Article 2(a), which states that ‘personal data’ shall mean any information relating to an identified or identifiable natural person (‘data subject’); an identifiable person is one who can be identified, directly or indirectly, in particular by reference to an identification number or to one or more factors specific to his physical, physiological, mental, economic, cultural or social identity. This definition is almost identical to that of the GDPR. The main difference is that the GDPR added further data that can identify an individual, such as name, location data and online identifier. By adding these into the mix, the GDPR is clarifying where individuals are presumed to be identified, helping organizations understand that the data associated with those identifiers is in scope and covered under the regulation.

Special categories of personal data are also defined under the Data Protection Directive. Article 8(1) states that Member States shall prohibit the processing of personal data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, trade-union membership, and the processing of data concerning health or sex life. The GDPR expanded on this definition as well, adding genetic data, biometric data, and sexual orientation data to the special categories. Essentially, the GDPR has taken the definitions for both personal data and special categories from the Data Protection Directive and provided more clarity, while making them more inclusive at the same time.

Most people probably expect the Data Protection Directive and the GDPR to have similar definitions, as they are essentially versions 1 and 2 of modern-day EU data privacy legislation, respectively. However, when compared to the definition of personal data contained in U.S.-based guidance, we start to see some key differences. As guidance from the National Institute of Standards and Technology (NIST) is widely accepted, let’s look at the definition found in its Guide to Protecting the Confidentiality of Personally Identifiable Information (PII) from 2010. NIST 800-122, Section 2.1 states that PII is any information about an individual maintained by an agency, including (1) any information that can be used to distinguish or trace an individual’s identity, such as name, social security number, date and place of birth, mother’s maiden name, or biometric records; and (2) any other information that is linked or linkable to an individual, such as medical, educational, financial, and employment information.

In breaking down this NIST definition, we see some similarities, in that the NIST definition starts off just as broadly with the phrasing “any information.” In the same vein, the wording “about an individual” speaks to the clarification provided in the GDPR definition as well. That being said, the definition then goes on to add more specifics regarding information that can identify or be linked to an individual, which is where we start to notice some differences. The identifying pieces of information listed in the NIST definition include name, social security number, date and place of birth, mother’s maiden name or biometric records. The GDPR is a bit more inclusive in its definition, including name, identification number, location data, and online identifier, which covers most of the items from the NIST definition but also adds the online portion. While the GDPR doesn’t include biometric data in the main definition, it does cover physical and genetic information in its list of other identifying factors.

The differences don’t stop there. The NIST definition does go on to provide guidance on other information that could be linked to the individual, but instead of listing specific data, it focuses on sectoral categories of data that seem to be derived from the sectoral privacy laws in the United States. The GDPR definition does not follow this pattern, and instead takes a more generic approach, listing the kinds of information that could be tied to an individual in most industries. Also, while the GDPR definition states that one or more of those other data elements can themselves identify the individual, the NIST definition brings that other information into scope as long as the individual is identified, but it does not state that the information can also be used to identify an otherwise unidentified individual.

Final Thoughts

With the GDPR becoming effective next year, it’s clear that this new definition of personal data expands on the preexisting EU definition of personal data contained in the Data Protection Directive. Additionally, it adds more specificity to the data that can be used to identify an individual in comparison with leading U.S. personal data definitions.

Why is this so important and relevant to organizations? This new definition of personal data is the most comprehensive definition to date, bringing into scope more information to be considered than any previous definitions in industry regulations or standards. Now, organizations will need to take another look at their previous determination of personal data and reevaluate their data management practices to ensure that the information they hold has been labeled and handled correctly. In fact, information deemed not applicable to past privacy regulations and standards may now become relevant when taking the new definition of personal data into consideration.

Look no further than IP addresses. Most companies wouldn’t normally lump in IP addresses with personal data, but the GDPR specifically calls out online identifiers in the definition of personal data. The Court of Justice of the European Union (CJEU) issued a judgement to that effect in Case C-582/14: Patrick Breyer v Bundesrepublik Deutschland, setting precedent that even dynamic IP addresses can be considered personal data in certain situations. Given this new standard, it will be important for organizations to incorporate judgements from recent cases and guidance from the Article 29 Working Party (being replaced by the European Data Protection Board in May of 2018) when determining how the GDPR impacts their organization and how best to comply.
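For organizations starting data discovery, one small and admittedly simplistic step is to find where IP addresses appear at all. The Python sketch below uses the standard library’s `ipaddress` module to pull valid IPv4 and IPv6 addresses out of a line of text. The function name `find_ip_addresses`, the token pattern, and the sample log line are assumptions made for this example. Finding an address only flags a candidate; whether it counts as personal data still depends on the situation, as the Breyer judgement indicates.

```python
import ipaddress
import re

# Rough token pattern; ipaddress does the real validation below.
CANDIDATE = re.compile(r"[0-9A-Fa-f:.]+")


def find_ip_addresses(text):
    """Return the valid IPv4/IPv6 addresses that appear in `text`."""
    found = []
    for token in CANDIDATE.findall(text):
        token = token.rstrip(".")  # drop a sentence-ending period
        try:
            found.append(str(ipaddress.ip_address(token)))
        except ValueError:
            continue  # not an address, e.g. a timestamp or a hex-like word
    return found


# Hypothetical log line using documentation-reserved addresses.
log_line = "2017-11-02 10:15:04 GET /account from 203.0.113.42 via 2001:db8::1"
hits = find_ip_addresses(log_line)
```

Validating each token with `ipaddress.ip_address` rather than relying on a regular expression alone keeps timestamps and similar strings from being mistaken for addresses.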

New procedures and criteria can be confusing, but hopefully the information above has provided some clarity around this new definition of personal data that the GDPR will introduce next year. Basic knowledge of these definitions can be a starting point for determining how the GDPR applies to your organization, and if approached from a comprehensive data and risk management standpoint, this information can help better prepare your organization for compliance with the GDPR and other future privacy regulations and frameworks.

(This post originally appeared on the Cloud Security Alliance blog.)

Chris Lippert

Chris Lippert is a senior associate at Schellman & Company LLC, a financial services advisory firm, and a member of the Cloud Security Alliance.