Data Retention: define and operationalise
Every organization knows that it should not keep data for longer than is "necessary", but determining what this means in practice can be very challenging.
It’s been a longstanding principle of European data privacy law that data should be held for “no longer than is necessary”. The GDPR does not specify exact data retention timescales, because what counts as necessary is context-specific. The problem is that the principle assumes each data element is collected for a single purpose, and that this purpose was apparent at the outset.
Let’s take a simple example: say you operate an app or website and collect someone’s data for order fulfillment purposes. Someone places an order, you fulfill the transaction, and deliver the services/goods. Has the ‘necessity’ of the data processing been exhausted, meaning that the data should be deleted? What if sometime later that individual were to dispute the transaction - perhaps claiming you overcharged, dispatched the wrong item, or delivered fewer than the number of goods ordered? If that happens, you will likely need the data to respond to the dispute and, potentially, defend any subsequent litigation. Each of these scenarios presents a potential ‘necessity’ to retain the data, and each has a different retention period associated with it - ranging from the very short (delivery of the goods) to the very long (statutory limitation periods for litigation).
The issue then becomes further complicated if you operate internationally. Imagine that the business decided to retain data for as long as is “necessary” to defend potential future legal claims - this means retaining the data for the duration of the relevant statutory limitation period. But statutory limitation periods are set by national laws - meaning that if you do business across multiple jurisdictions, you need to understand the different statutory limitation periods in each of those countries and set country-specific retention periods accordingly.
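To make the purpose-by-country problem concrete, here is a minimal sketch of a retention schedule in Python. Everything in it is an assumption for illustration: the purpose names, the country codes `AA` and `BB`, and the day counts are placeholders rather than real legal values, and measuring the period from the collection date is a simplification. The idea it shows is that a record collected for several purposes stays as long as the longest applicable period for the relevant country.

```python
from datetime import date, timedelta

# Hypothetical retention periods in days, keyed by (purpose, country).
# Country codes and durations are placeholders, not real legal values.
RETENTION_DAYS = {
    ("order_fulfilment", "AA"): 90,
    ("order_fulfilment", "BB"): 90,
    ("dispute_handling", "AA"): 365,
    ("dispute_handling", "BB"): 180,
    ("legal_claims", "AA"): 6 * 365,
    ("legal_claims", "BB"): 3 * 365,
}


def retention_deadline(collected_on, purposes, country):
    """Return the last date on which any of the given purposes still applies.

    Simplification: every period is counted from the collection date.
    """
    days = [RETENTION_DAYS[(purpose, country)] for purpose in purposes]
    return collected_on + timedelta(days=max(days))


order_date = date(2024, 1, 1)
for country in ("AA", "BB"):
    deadline = retention_deadline(
        order_date, ["order_fulfilment", "legal_claims"], country
    )
    print(country, deadline.isoformat())
```

The same order ends up with a different deadline in each country, which is the point of the paragraph above: the schedule has to be set per jurisdiction, not once for the whole business.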
Creating a comprehensive data retention program therefore requires an intimate understanding of both the data and its uses, and of the relevant laws, regulations, and risks that affect the business and may mandate specific retention periods.
- Recognise that you need a data retention program. Some organizations, faced with the complexity of establishing a data retention program, may choose simply to ignore or postpone the problem. That’s not a good response: holding on to your data indefinitely would be inconsistent with the principles of data minimisation and purpose limitation.
- Start. Don’t let the great be the enemy of the good. Any data retention program, even if imperfect, is better than no data retention program. You can think of possible approaches to data retention as sitting on a spectrum - with indefinite data retention (not good) at one end, and a perfect, comprehensive, granular program at the other.
- Solve. If you need to retain some data for particularly long periods of time (e.g. product improvement or machine learning), then consider anonymizing the data first. Remember that data protection laws - and so the requirement to retain data for “no longer than is necessary” - apply only to personal data. Data that is not personal falls outside of data protection law and, in principle, can be retained indefinitely.
- Retention periods. Make sure you have at least some concrete justifications for why you keep data for the periods you do, rather than a vague “because it might be useful someday” type argument. Even with these justifications, your view of why it is “necessary” to keep data will likely not always align with what a regulator, data subject, or court considers “necessary”, but having them at least goes some way to showing that you put considered thought into your retention program.
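The last two points, anonymising data you want to keep long term and recording a concrete justification for each period, can be sketched as a small rule table. This is only an illustration: `RetentionRule`, `action_for`, the purposes, the day counts, and the justification strings are all hypothetical, and the sketch does not address how anonymisation itself would be carried out or whether the result is truly non-personal.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RetentionRule:
    purpose: str
    keep_days: int                 # placeholder value, not legal guidance
    justification: str             # the documented reason for this period
    anonymise_after: bool = False  # keep a de-identified copy instead of deleting


def action_for(rule: RetentionRule, collected_on: date, today: date) -> str:
    """Decide what to do with a record once its retention period is checked."""
    expires_on = collected_on + timedelta(days=rule.keep_days)
    if today <= expires_on:
        return "keep"
    return "anonymise" if rule.anonymise_after else "delete"


# Hypothetical rules; each one carries its own written justification.
RULES = [
    RetentionRule("order_fulfilment", 90,
                  "Needed until the order has been delivered"),
    RetentionRule("product_improvement", 365,
                  "Trend analysis; identifiers not needed after one year",
                  anonymise_after=True),
]

today = date(2025, 6, 1)
for rule in RULES:
    decision = action_for(rule, date(2024, 1, 1), today)
    print(f"{rule.purpose}: {decision} ({rule.justification})")
```

Keeping the justification next to the period means that whoever reviews the schedule later can see why each number was chosen, not just what it is.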