Disclaimer: The opinions expressed here are solely my own and not those of any employer, client, or affiliated organisation.

Why data minimisation matters in the age of AI-powered cyber attacks

AI is changing the risk profile of cybersecurity, turning data minimisation from a privacy nicety into a frontline defence. The less data you hold, the less there is for AI‑enabled attackers to exploit.


There’s a shift happening in cybersecurity that is now reaching mainstream governance conversations: attackers are using the same AI tools that organisations are enthusiastically adopting. And that changes the threat landscape considerably.

For years, data minimisation has been one of those principles that everyone agrees with in theory, but quietly sidelines in practice. Storage is cheap, data is “strategic,” and future use cases are always just around the corner. So we keep things. Just in case. But “just in case” is starting to look like a liability.

The asymmetry has changed

AI has dramatically lowered the cost and skill required to execute sophisticated cyber attacks. What used to require time, expertise, and coordination can now be automated, scaled, and refined with alarming ease. Attackers can now:

  • Rapidly analyse large, unstructured datasets once exfiltrated
  • Harvest and stockpile encrypted datasets now, betting that future quantum computing will enable decryption (storage is cheap for attackers too)
  • Generate highly convincing phishing and social engineering campaigns using contextual data
  • Identify sensitive patterns or relationships buried in otherwise “low value” data
  • Iterate attacks in real time based on responses

This creates a new kind of asymmetry. Organisations are still thinking in terms of perimeter defence and compliance checklists, while attackers are thinking in terms of data exploitation at scale.

And the more data you hold, the more raw material you are offering them.

Data is no longer inert

One of the more dangerous assumptions in traditional data governance is that stored data is relatively passive. It sits in databases, archives, or backups, waiting to be used. That assumption no longer holds.

With modern AI tools, even poorly structured, incomplete, or seemingly trivial datasets can be transformed into intelligence. Fragments can be stitched together. Context can be inferred. Identities can be reconstructed.

What used to be “harmless” data exhaust is now a potential attack surface.

This is particularly relevant for:

  • Historical datasets retained beyond their original purpose
  • Logs and metadata that reveal behavioural patterns
  • Customer interaction records and communications
  • Internal documents and knowledge repositories

In other words, the long tail of data that most organisations barely think about.

Data minimisation as a security control

Data minimisation has traditionally been framed as a privacy principle. Collect less. Retain less. Use only what you need. That framing is now incomplete.

Data minimisation is increasingly a core cybersecurity control.

If an attacker gains access to your environment, the impact is directly proportional to what they can access and exploit. Reducing data holdings reduces the blast radius. It also limits what attackers can derive through AI-driven analysis of whatever they do obtain.

This is not just about compliance with privacy regulation. It’s about resilience.

A useful mental shift is this: Don’t ask “what data might be useful someday?” Instead ask: “what data would I regret losing control of?”

The economics of keeping data have flipped

For a long time, the argument for retaining data was straightforward:

  • Storage costs were falling
  • Data could potentially unlock future value
  • Deleting data felt like losing an asset

But AI changes the risk side of that equation. The marginal cost of storing data may be low, but the marginal risk has increased significantly.

Every additional dataset:

  • Expands your attack surface
  • Increases breach impact
  • Complicates governance and oversight
  • Creates new opportunities for misuse or unintended inference

In effect, data is no longer just an asset. It is also a liability with compounding risk.

Practical implications

This isn’t a call for indiscriminate deletion. It’s a call for intentionality. Organisations should be actively:

  • Reviewing retention policies with a security lens, not just compliance
  • Identifying “dark data” that has no clear purpose or owner
  • Aligning data collection practices with defined use cases
  • Embedding minimisation into system and process design, not as an afterthought
  • Treating data lifecycle management as a core governance capability

Importantly, this needs to be led from the top. Data minimisation often fails because incentives favour accumulation, not restraint.
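
To make the "dark data" step above concrete, here is a minimal sketch of what an automated audit might look like. It flags files that have no recorded owner or that have outlived a retention window. The paths, the owner registry, and the one-year retention period are all illustrative assumptions, not a prescription; real policies vary by data class and jurisdiction.

```python
import time
from pathlib import Path

# Illustrative retention window; real policies vary by data class.
RETENTION_DAYS = 365

def audit_dark_data(root, owners, retention_days=RETENTION_DAYS):
    """Flag files with no recorded owner or past the retention window.

    `owners` maps paths (relative to `root`) to a responsible team,
    standing in for whatever data catalogue an organisation keeps.
    """
    flagged = []
    cutoff = time.time() - retention_days * 86400
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        rel = str(path.relative_to(root))
        reasons = []
        if rel not in owners:
            reasons.append("no owner")        # nobody accountable for it
        if path.stat().st_mtime < cutoff:
            reasons.append("past retention")  # candidate for deletion review
        if reasons:
            flagged.append((rel, reasons))
    return flagged
```

Even a crude scan like this surfaces the long tail of unowned, stale data that the sections above describe; the point is to make the inventory visible so a human can decide what to keep, and why.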

A cultural shift

At its core, this is a cultural issue. For years we’ve encouraged organisations to believe that more data is always better, and that value comes from relentless accumulation. We’ve hoarded our stores of data like dragons hoarding their gold. That mindset no longer serves us. In an environment where AI lowers the bar for attackers and raises the cost of every breach, disciplined data practices are no longer optional; they’re foundational. Data minimisation isn’t about doing less with data. It’s about being intentional about what you keep, and why.

© 2002-2026 Kate Carruthers