What Is Not Personal Data? Drawing the Line Under GDPR

Knowing what is not personal data is just as important as understanding what is. If you misclassify non-personal data as personal data, you impose unnecessary compliance overhead on your business. If you misclassify personal data as non-personal, you risk fines, enforcement actions, and loss of user trust.

This guide explains where the GDPR and other privacy laws draw the line, with concrete examples on both sides. This is educational content, not legal advice. Consult a qualified attorney for guidance specific to your situation.

Defining Personal Data Before You Can Exclude It

Before you can determine what is not personal data, you need a precise understanding of what personal data is. Article 4(1) of the GDPR defines personal data as:

any information relating to an identified or identifiable natural person ('data subject')

An identifiable person is one who can be identified, directly or indirectly, by reference to an identifier such as a name, an identification number, location data, an online identifier, or one or more factors specific to the physical, physiological, genetic, mental, economic, cultural, or social identity of that person.

Three elements matter in this definition:

"Any information": The format is irrelevant. Text, numbers, images, audio, biometric templates, and IP addresses all qualify if they relate to an identifiable person.
"Relating to": The data must have a connection to the individual, whether by content, purpose, or result.
"Identifiable": Someone, somewhere, using reasonably available means, could link the data to a specific living person.

If any one of these three elements is missing, the data is not personal data.

Categories of Data That Are Not Personal Data

Understanding what is not personal data requires looking at specific categories. The following types generally fall outside the scope of the GDPR and similar privacy laws.

Truly Anonymized Data

Recital 26 of the GDPR explicitly excludes anonymous information from its scope. Anonymous information means data that does not relate to an identified or identifiable natural person, or data that has been rendered anonymous in such a way that the data subject is not or is no longer identifiable.

The key requirement is irreversibility. If the anonymization process can be reversed, or if re-identification is reasonably likely considering the time, cost, and technology available, the data is still personal data. Pseudonymized data, where identifiers are replaced by tokens but a key exists to re-link them, remains personal data: Article 4(5) defines pseudonymization, and Recital 26 treats pseudonymized data as information on an identifiable person.

Examples of truly anonymized data:

A dataset showing that 34% of users in a country prefer a certain product category, with no way to trace any percentage back to individual users
Aggregate website traffic statistics (e.g., 50,000 page views last month) stripped of all user-level identifiers
Survey results published as summary statistics with responses from a large enough pool that individuals cannot be singled out

Aggregated Statistics

When individual data points are combined into group-level metrics, the result is often non-personal. Monthly revenue figures, average session duration across all visitors, or total units sold per region do not identify anyone.

However, aggregation can fail to anonymize if the group is small enough. If a report shows "average salary of employees in department X" and department X has two people, the data is effectively personal. The European Data Protection Board (EDPB) has warned that small-group aggregation does not qualify as anonymization.
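
One way to guard against the small-group problem is to withhold any aggregate computed over too few people. The sketch below is a minimal illustration, not a compliance guarantee. The threshold of 5, the function name, and the sample records are all assumptions made for the example; the GDPR does not prescribe a minimum group size.

```python
from collections import defaultdict
from statistics import mean

MIN_GROUP_SIZE = 5  # assumed threshold; not a figure set by the GDPR


def average_by_group(records, group_key, value_key, min_size=MIN_GROUP_SIZE):
    """Return per-group averages, withholding groups smaller than min_size."""
    groups = defaultdict(list)
    for record in records:
        groups[record[group_key]].append(record[value_key])

    report = {}
    for name, values in groups.items():
        # A tiny group lets readers back out individual values, so suppress it.
        report[name] = mean(values) if len(values) >= min_size else None
    return report


# Hypothetical sample data: six engineers and a two-person legal team.
salaries = (
    [{"department": "engineering", "salary": 50_000 + 1_000 * i} for i in range(6)]
    + [{"department": "legal", "salary": 90_000}, {"department": "legal", "salary": 70_000}]
)
report = average_by_group(salaries, "department", "salary")
```

Here the engineering average is published, while the two-person legal team is reported as None rather than as a figure from which either salary could be worked out.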

Company and Organizational Data

Data about legal entities, not natural persons, is not personal data under the GDPR. Recital 14 states that the regulation does not cover the processing of personal data relating to legal persons.

Examples:

Company registration numbers
Business tax identification numbers
Corporate financial statements
Company email addresses that use a role-based format (e.g., [email protected], [email protected])
Business addresses

One important caveat: data about sole proprietors or partnerships where the business identity is inseparable from a natural person can still be personal data. A sole trader's business email that contains their personal name ([email protected]) may qualify as personal data.

Data About Deceased Persons

The GDPR applies to living natural persons only. Recital 27 states that the regulation does not apply to the personal data of deceased persons, though it allows Member States to provide rules for processing such data. Some countries, like France and Italy, have enacted specific protections for deceased individuals' data.

Purely Technical, Non-Identifying Data

Certain technical data points that cannot be linked to a person are not personal data:

Server error codes (e.g., 404, 500)
Application version numbers
Hardware specifications reported in aggregate (e.g., "60% of sessions used devices with 8 GB RAM")
Network latency measurements between servers

Be careful here. Technical data that includes IP addresses, device fingerprints, or user-agent strings combined with timestamps can become personal data because these combinations can identify individuals.

Publicly Available Non-Personal Information

Data that is publicly available and does not relate to an identifiable individual is not personal data:

Weather data
Stock prices
Government-published statistics at a population level
Geographic or geological data
Scientific measurements (temperature, air quality indices)

The Gray Zone: Data That Looks Non-Personal but Is Not

The most common compliance mistakes happen in the gray zone. Data that seems harmless can become personal data, and therefore your compliance obligation, when context changes.

IP Addresses

The Court of Justice of the European Union (CJEU) ruled in the Breyer case (C-582/14, 2016) that even dynamic IP addresses can be personal data when the entity processing them has lawful means to obtain additional information linking the IP to a person. For most website operators, IP addresses are personal data.
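
Because IP addresses are usually personal data, some operators shorten them before storing logs. The sketch below shows one such data-minimization step using Python's standard ipaddress module. The prefix lengths (/24 for IPv4, /48 for IPv6) and the function name are assumptions chosen for illustration. Truncation reduces precision, but whether the result is still personal data depends on what else is stored alongside it.

```python
import ipaddress


def truncate_ip(address: str) -> str:
    """Zero the host portion of an IP address (assumed /24 for IPv4, /48 for IPv6)."""
    ip = ipaddress.ip_address(address)
    prefix = 24 if ip.version == 4 else 48
    # strict=False lets us pass a host address and get its enclosing network.
    network = ipaddress.ip_network(f"{address}/{prefix}", strict=False)
    return str(network.network_address)


# Addresses from the ranges reserved for documentation.
short_v4 = truncate_ip("203.0.113.77")
short_v6 = truncate_ip("2001:db8:abcd:1234::1")
```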

Device Fingerprints

A combination of browser type, screen resolution, installed fonts, and timezone does not look like personal data in isolation. But research has shown that browser fingerprints can uniquely identify over 90% of users. When your website collects enough device attributes to distinguish one visitor from another, you are processing personal data.
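
To see why harmless-looking attributes add up to an identifier, consider the sketch below. It hashes a handful of attributes into a single value and counts how many visitors end up with a value nobody else shares. The attribute names, the sample visitors, and the fingerprint function are hypothetical; real fingerprinting scripts collect far more signals.

```python
import hashlib
from collections import Counter


def fingerprint(attributes: dict) -> str:
    """Hash the sorted attribute pairs into one stable identifier."""
    canonical = "|".join(f"{k}={attributes[k]}" for k in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Hypothetical visitors; two of them differ only in installed font count.
visitors = [
    {"browser": "Firefox", "screen": "1920x1080", "timezone": "UTC+1", "fonts": 212},
    {"browser": "Firefox", "screen": "1920x1080", "timezone": "UTC+1", "fonts": 187},
    {"browser": "Chrome", "screen": "1366x768", "timezone": "UTC-5", "fonts": 94},
]
counts = Counter(fingerprint(v) for v in visitors)
# Share of visitors whose combined fingerprint is unique in this sample.
unique_share = sum(1 for c in counts.values() if c == 1) / len(visitors)
```

In this sample, browser type alone leaves two visitors indistinguishable, but the combined attributes separate all three.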

Cookie Identifiers

Cookies that assign a unique identifier to a browser are personal data under the GDPR. Recital 30 explicitly states that online identifiers such as cookie identifiers and internet protocol addresses can be used to create profiles of natural persons and identify them. If you use cookies on your website, your cookie policy and privacy notice must account for this.

Location Data

City-level location data from IP geolocation might not identify a person in a large city. But the same data point in a small town, combined with a timestamp, could narrow identification to a handful of people. Context determines classification.

Pseudonymized Data

Pseudonymization replaces direct identifiers with tokens, but the original data can be re-linked using a key. Article 4(5) of the GDPR defines the technique, and Recital 26 makes clear that pseudonymized data which can be attributed to a person using additional information is still personal data. Only irreversible anonymization removes data from the GDPR's scope.
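
The difference is easier to see in code. The following sketch pseudonymizes an email address with a keyed hash (HMAC-SHA256) and then shows that anyone holding the key can link the token back to the person. The key, the function names, and the example.com addresses are assumptions for illustration only; this is one possible tokenization scheme, not a method the GDPR prescribes.

```python
import hashlib
import hmac
from typing import Optional, Sequence

SECRET_KEY = b"example-key-kept-separately"  # hypothetical key for illustration


def pseudonymize(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Replace a direct identifier with a keyed token (HMAC-SHA256)."""
    return hmac.new(key, identifier.lower().encode("utf-8"), hashlib.sha256).hexdigest()


def relink(token: str, candidates: Sequence[str], key: bytes = SECRET_KEY) -> Optional[str]:
    """Anyone holding the key can test candidates and recover the identifier."""
    for candidate in candidates:
        if hmac.compare_digest(pseudonymize(candidate, key), token):
            return candidate
    return None


token = pseudonymize("alice@example.com")
recovered = relink(token, ["bob@example.com", "alice@example.com"])
```

The token hides the address from casual view, yet the key holder recovers it in one pass. That retained ability to re-link is why the output stays personal data.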

How Other Privacy Laws Define the Boundary

The GDPR is not the only law that distinguishes personal from non-personal data. Other jurisdictions draw similar but not identical lines.

CCPA / CPRA (California)

The CCPA defines personal information as information that identifies, relates to, describes, is reasonably capable of being associated with, or could reasonably be linked, directly or indirectly, with a particular consumer or household. It explicitly excludes:

Publicly available information from government records
De-identified data that cannot reasonably be linked to a consumer
Aggregate consumer information

Civil penalties for mishandling personal information can reach $2,500 per violation, or $7,500 per intentional violation.

LGPD (Brazil)

Brazil's Lei Geral de Proteção de Dados defines personal data as information related to an identified or identifiable natural person. Anonymized data is excluded unless the anonymization process can be reversed.

PIPEDA (Canada)

Canada's Personal Information Protection and Electronic Documents Act defines personal information as information about an identifiable individual. Business contact information used for commercial communications (name, title, business address, business phone) is explicitly excluded.

Practical Tests: Is This Data Personal or Not?

When you are unsure whether a dataset qualifies as personal data, apply these three tests:

The Identification Test

Ask: can this data, alone or combined with other information I hold or could reasonably obtain, identify a specific living person?

If yes, it is personal data.
If no, move to the next test.

The Singling-Out Test

Ask: can this data be used to single out one person from a group, even without knowing their name?

The EDPB uses three criteria for anonymization: is it still possible to single out an individual, to link records relating to the same individual, or to infer information about an individual? If any answer is yes, the data has not been anonymized.
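
A rough way to check for singling out is to count how many records share each combination of indirect identifiers. The sketch below is a simplified k-anonymity style check under assumed field names (age_band, postcode_area); it covers only the singling-out criterion, not linkability or inference.

```python
from collections import Counter


def smallest_group(records, quasi_identifiers):
    """Return the size of the rarest combination of quasi-identifier values."""
    combos = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(combos.values())


def singled_out(records, quasi_identifiers):
    """Return the records whose quasi-identifier combination is unique."""
    combos = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return [r for r in records if combos[tuple(r[q] for q in quasi_identifiers)] == 1]


# Hypothetical dataset with no names in it.
rows = [
    {"age_band": "30-39", "postcode_area": "AB1", "plan": "pro"},
    {"age_band": "30-39", "postcode_area": "AB1", "plan": "free"},
    {"age_band": "40-49", "postcode_area": "AB1", "plan": "free"},
]
k = smallest_group(rows, ["age_band", "postcode_area"])
unique_rows = singled_out(rows, ["age_band", "postcode_area"])
```

A smallest group of 1 means at least one person can be isolated without a name, so the dataset fails the singling-out test.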

The Mosaic Test

Ask: if someone combined this data with other reasonably available datasets, could they identify an individual?

Even if your data alone is non-identifying, you must consider what a motivated party could do with publicly accessible information. Court records, social media profiles, electoral rolls, and commercial data brokers all provide linking material.
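
The mosaic effect can be simulated by joining a "de-identified" release against a public list on the attributes they share. In the sketch below, the two datasets, the field names, and the link_records function are invented for illustration; a real assessment would use the auxiliary sources an attacker could plausibly reach.

```python
def link_records(released, public, keys):
    """Join two datasets on shared attributes; unique matches re-identify a row."""
    index = {}
    for person in public:
        index.setdefault(tuple(person[k] for k in keys), []).append(person)

    matches = []
    for row in released:
        candidates = index.get(tuple(row[k] for k in keys), [])
        if len(candidates) == 1:  # exactly one candidate means re-identification
            matches.append({**row, "name": candidates[0]["name"]})
    return matches


# Hypothetical release without names, and a hypothetical public register.
released = [{"birth_year": 1984, "postcode": "AB1 2CD", "diagnosis": "asthma"}]
public = [
    {"name": "Person A", "birth_year": 1984, "postcode": "AB1 2CD"},
    {"name": "Person B", "birth_year": 1990, "postcode": "AB1 2CD"},
]
reidentified = link_records(released, public, ["birth_year", "postcode"])
```

Neither dataset names the patient on its own, but the join attaches a name to the diagnosis.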

If the answer to all three questions is no, the data is likely not personal data. Document your reasoning, because regulators may ask.

Best Practices for Handling the Boundary

Whether you are drafting a privacy policy or deciding which data your analytics should collect, these practices reduce risk.

Classify Before You Collect

Build a data inventory that categorizes every data element as personal, potentially personal, or non-personal. Review the classification quarterly and after any change to data sources or processing activities.
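
A data inventory does not need special tooling to start. The sketch below models the three categories and flags entries whose review is overdue. The class names, the sample elements, and the 90-day interval (standing in for "quarterly") are assumptions for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum


class Classification(Enum):
    PERSONAL = "personal"
    POTENTIALLY_PERSONAL = "potentially personal"
    NON_PERSONAL = "non-personal"


@dataclass
class DataElement:
    name: str
    classification: Classification
    last_reviewed: date

    def review_due(self, today: date, interval_days: int = 90) -> bool:
        """True when the last review is older than the assumed quarterly interval."""
        return today - self.last_reviewed > timedelta(days=interval_days)


# Hypothetical inventory entries.
inventory = [
    DataElement("ip_address", Classification.PERSONAL, date(2024, 1, 10)),
    DataElement("city_level_location", Classification.POTENTIALLY_PERSONAL, date(2024, 5, 1)),
    DataElement("server_error_code", Classification.NON_PERSONAL, date(2024, 5, 1)),
]
overdue = [e.name for e in inventory if e.review_due(today=date(2024, 6, 1))]
```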

Anonymize Properly

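The Article 29 Working Party (the EDPB's predecessor) published Opinion 05/2014 on anonymization techniques. It groups techniques into two main families, randomization and generalization; suppression of unneeded fields is commonly applied alongside them: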

Randomization: Adding noise to data values so individual records cannot be isolated. Differential privacy is a modern example.
Generalization: Reducing the precision of data (e.g., replacing exact age with an age range, or exact location with a region).
Suppression: Removing data fields entirely when they are not needed.

No single technique guarantees anonymization. Combine methods and test against re-identification attacks.
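
The three techniques can be combined on a single record, as in the minimal sketch below. The function names and sample record are hypothetical, and the Gaussian noise shown here is plain perturbation, not a calibrated differential privacy mechanism. Applying these steps does not by itself prove a dataset is anonymous.

```python
import random


def generalize_age(age: int, width: int = 10) -> str:
    """Generalization: replace an exact age with a range such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def suppress(record: dict, fields: set) -> dict:
    """Suppression: drop fields that the analysis does not need."""
    return {k: v for k, v in record.items() if k not in fields}


def add_noise(value: float, scale: float, rng: random.Random) -> float:
    """Randomization: perturb a value with zero-mean Gaussian noise."""
    return value + rng.gauss(0.0, scale)


rng = random.Random(0)  # seeded only to make the example repeatable
record = {"name": "Person A", "age": 34, "region": "North", "spend": 120.0}

cleaned = suppress(record, {"name"})
cleaned["age"] = generalize_age(cleaned["age"])
cleaned["spend"] = add_noise(cleaned["spend"], scale=5.0, rng=rng)
```

After these steps the record has no name, an age range of "30-39", and a perturbed spend figure. It should still be run through the singling-out and linkage checks before being treated as anonymous.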

Document Your Reasoning

If you treat data as non-personal, write down why. Record the tests you applied, the datasets you considered, and the conclusion you reached. This documentation protects you if a supervisory authority challenges your classification.
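
The record itself can be as simple as a structured note stored next to the dataset. The sketch below shows one possible shape, serialized to JSON; every field name and value is an assumption for illustration, not a format required by any regulator.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ClassificationRecord:
    dataset: str
    conclusion: str
    tests_applied: list = field(default_factory=list)
    datasets_considered: list = field(default_factory=list)
    reasoning: str = ""
    decided_on: str = ""  # ISO 8601 date string


# Hypothetical entry for an aggregate traffic report.
record = ClassificationRecord(
    dataset="monthly_traffic_summary",
    conclusion="non-personal",
    tests_applied=["identification", "singling-out", "mosaic"],
    datasets_considered=["web server logs", "CRM export"],
    reasoning="Only site-wide totals are kept; no user-level identifiers remain.",
    decided_on="2024-06-01",
)
serialized = json.dumps(asdict(record), indent=2, sort_keys=True)
```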

Reassess When Context Changes

Data classified as non-personal today may become personal tomorrow. A new publicly available dataset, a change in technology, or a business acquisition can shift the balance. Build reassessment triggers into your data governance process.

Use a Privacy Policy Regardless

Even if you believe you collect only non-personal data, publishing a privacy policy is a transparency best practice. Most websites set cookies, use analytics, or embed third-party scripts that collect personal data, often without the operator realizing it. Compliance tools like TermsBox can scan your website to detect exactly what data your site collects, removing the guesswork.

Common Misconceptions About Non-Personal Data

Several widely held beliefs about what is not personal data are incorrect. Clearing these up prevents costly mistakes.

"Hashed email addresses are not personal data." Hashing is a pseudonymization technique, not anonymization. A hashed email can be re-identified by hashing known email addresses and comparing the outputs. Hashed emails remain personal data.

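The re-identification described for hashed emails takes only a few lines. This sketch assumes unsalted SHA-256 over lowercased addresses, which is a common but not universal practice, and uses made-up example.com addresses. The function names are hypothetical.

```python
import hashlib


def sha256_hex(email: str) -> str:
    """Hash a normalized email address the way many systems do (unsalted)."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()


def reidentify(hashed_emails, known_emails):
    """Hash each known address and look for it among the stored hashes."""
    lookup = {sha256_hex(e): e for e in known_emails}
    return {h: lookup[h] for h in hashed_emails if h in lookup}


# Hypothetical stored hashes and an attacker's list of guesses.
stored_hashes = [sha256_hex("alice@example.com"), sha256_hex("carol@example.com")]
guesses = ["alice@example.com", "bob@example.com"]
recovered = reidentify(stored_hashes, guesses)
```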
"If we delete names, the data is anonymous." Names are just one identifier. Combinations of date of birth, postal code, and gender can uniquely identify 87% of the U.S. population, according to research by Latanya Sweeney. Removing names alone is not enough.

"Metadata is not personal data." Email metadata (sender, recipient, timestamp, subject line), phone call metadata (caller, recipient, time, duration), and browsing metadata (URLs visited, timestamps) can all reveal personal information and identify individuals.

"Encrypted data is not personal data." Encryption protects data in transit and at rest, but the underlying data is still personal data because decryption is possible with the key. Encryption is a security measure, not an anonymization technique.

"Business data is never personal data." While company data is generally outside the GDPR's scope, data about employees, directors, or sole proprietors acting in a business capacity may still be personal data. A sales CRM full of individual contact names, direct phone numbers, and personal email addresses contains personal data regardless of the business context.

Frequently Asked Questions

What is not considered personal data under GDPR?

Data that cannot identify a living individual, either on its own or when combined with other available information, is not personal data. Common examples include truly anonymized datasets, aggregated statistics, company registration numbers, and weather data.

Is anonymized data personal data?

No, provided the anonymization is irreversible. Recital 26 of the GDPR states that data rendered truly anonymous, so that the individual is no longer identifiable, falls outside the regulation. However, if re-identification is reasonably possible, the data remains personal data.

Can non-personal data become personal data?

Yes. Data that appears non-personal in isolation can become personal data when combined with other datasets that make identification possible. A city-level location is not personal data on its own, but paired with a date, time, and device type, it might identify someone.

Do I still need a privacy policy if I only collect non-personal data?

If you genuinely collect zero personal data, the GDPR does not require a privacy policy. In practice, most websites set cookies or log IP addresses, both of which qualify as personal data. A privacy policy is still a best practice for transparency, and tools like a privacy policy generator make creating one straightforward.
