Australia’s data sovereignty ambitions will fail without research compute

Reminiscing

One thing people might not know about me is that in early 2013 I joined the UNSW Faculty of Engineering as IT Manager. My remit covered teaching labs for our undergraduate and postgraduate students, research computing, and all the usual administrative systems.

At the time, Engineering was the largest faculty in the university and had the biggest concentration of active researchers. Part of my job was looking after roughly 20 high‑performance computing (HPC) clusters that underpinned research across the faculty, many of them running flat out on extremely compute‑hungry workloads.

I remember standing in a corridor chatting with three different academics, just trying to get a sense of how much data we had under management. By the third office we were talking in exabytes and I had given up trying to do the maths in my head, other than thinking: that is a lot of data. On another occasion, after I had managed to procure some extra research compute, a fluid dynamics researcher casually asked if I happened to have a spare petabyte because he had already filled what he had. Now it is worth noting that this was in the olden days, long before AI had become the talk of the town.

So when I talk about the challenges of compute‑intensive research, I am not doing it from the sidelines. I have seen up close what this means for Australian universities, and I have strong views about what it means for our international competitive positioning.

The problem

Australia’s lack of serious, at‑scale compute is not just a research productivity problem, it is a sovereignty problem as well. Our universities are trying to do world‑class work on infrastructure that simply does not match the ambitions in our national AI strategies or data sovereignty rhetoric.

Compute and the research squeeze

Australian researchers can tap into national facilities and institutional clusters, but these are heavily contested, oversubscribed, and often behind global best practice in terms of GPU capacity for modern AI workloads. For many projects, once you add storage, data transfer, and specialised support, the practical ceiling on what can be done inside Australia is much lower than policymakers tend to assume.

So teams end up stitching together whatever they can get from local and international HPC allocations, small internal clusters, and credits on overseas and local clouds. It is a fragile patchwork that works for modest experiments, but it does not support sustained frontier‑scale AI research or large multi‑institutional data projects.

Many of our serious researchers book time on large‑scale international HPC resources in what are now volatile regions such as Saudi Arabia and the UAE (where data centres have recently come under attack), or in the US, where access is exposed to ongoing cuts to science and technology budgets.

The Australian government provides some support via the good folks at the Australian Research Data Commons (ARDC); however, the resources available through them are dwarfed by the scale of demand from Australian research.

AI is pushing the limits

AI research is now right at the edge of what most institutional infrastructure can handle. Training and fine‑tuning large models demand enormous amounts of compute, fast networks, and specialist operational expertise, and the bar keeps rising every year.
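
To put a rough number on “enormous”, here is a back‑of‑envelope sketch of the GPU time a single large training run can consume. Every figure in it is an illustrative assumption (model size, token count, accelerator throughput, sustained utilisation) rather than a measurement from any particular project or facility:

```python
# Back-of-envelope estimate of training compute for a large language model.
# All inputs below are illustrative assumptions, not measurements.

params = 70e9           # assumed model size: 70 billion parameters
tokens = 1.4e12         # assumed training data: 1.4 trillion tokens

# Widely used approximation for dense transformer training cost:
# total FLOPs ~= 6 * parameters * tokens
flops_needed = 6 * params * tokens

gpu_peak_flops = 1e15   # assumed ~1 PFLOP/s peak per modern accelerator
utilisation = 0.4       # assumed 40% of peak sustained in practice

gpu_seconds = flops_needed / (gpu_peak_flops * utilisation)
gpu_hours = gpu_seconds / 3600

print(f"Total training compute: {flops_needed:.2e} FLOPs")
print(f"GPU-hours required:     {gpu_hours:,.0f}")
# On these assumptions: ~5.9e23 FLOPs and roughly 400,000 GPU-hours,
# i.e. a couple of weeks of wall-clock time on a thousand accelerators.
```

Even with generous assumptions, a single run of this shape lands in the hundreds of thousands of GPU‑hours, far more than most institutional clusters can allocate to any one project, and the frontier keeps moving up from there.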

This is not just about bigger models for the sake of it. If Australian researchers want to work on safety, alignment, evaluation, or domain‑specific models in areas like health, climate, or defence, they need access to systems that look and feel like the ones used at the global frontier. Right now, too many of our projects are constrained to “toy” scale experiments that cannot easily be translated into production‑grade systems.

Sovereign AI needs sovereign compute

We like to talk about “sovereign AI” as if we can regulate our way to autonomy while renting most of the underlying infrastructure from offshore hyperscalers. In reality, sovereignty in AI is about control and resilience, and compute, chips, and energy are now strategic resources in their own right.

If we cannot train or even reliably run critical models on infrastructure that is physically in Australia, under Australian jurisdiction, and operated by people who work to Australian law and standards, then we are not sovereign; we are tenants. At best we have a kind of “AI tenancy” arrangement, where access to essential capability ultimately depends on commercial terms, foreign policy settings, and someone else’s risk appetite.

Geopolitics and AI risk

All of this sits inside an increasingly tense geopolitical environment. Compute, advanced chips, and AI capability are now wrapped up in export controls, sanctions regimes, and strategic competition between major powers. That means the platforms Australian researchers rely on today may not be as dependable tomorrow as our risk models assume.

If a significant share of our research workloads lives on infrastructure controlled by companies headquartered in other jurisdictions, then our compute pipeline is exposed to decisions made in foreign capitals and foreign boardrooms. It is not hard to imagine scenarios where access, pricing, or permissible workloads change quickly in response to geopolitical shock. Building sovereign capability is partly about reducing that exposure, so that core public interest research can continue regardless of which way the geopolitical winds are blowing.

Data sovereignty and university research

The same tension shows up in data sovereignty. There is a growing push to keep sensitive government, health, defence, and critical infrastructure data onshore and subject to Australian law. Yet our undercooked research compute story means that to do serious AI work with those datasets, universities are often nudged towards foreign‑owned cloud platforms with opaque data flows.

It is worth pondering what we actually mean by data sovereignty in Australia:

"Data sovereignty refers to the right of a nation to control and manage its own data, regardless of where that data originated and [is] stored. This means that a country has the authority to determine how its data is collected, processed, and shared, as well as enforce its own laws and regulations related to data protection and privacy. Data sovereignty is often linked to national security, as countries may be concerned about foreign access to sensitive data.
Data localisation, on the other hand, refers to the requirement that data be stored within a particular country’s borders. This does not mean that a country has full control of the data, as the laws of other countries may also apply."
From the Influence of international digital platforms report, November 2023.

Policy frameworks recognise that Australian data is essential to building AI that reflects the diversity and complexity of our society, and that we need consistent standards and national data infrastructure to make that usable for research. But when the compute that can actually process those datasets at scale lives elsewhere, our “data sovereignty” risks becoming performative: the bits might be notionally onshore, but the meaningful capability to work with them sits offshore.

The emerging sovereign AI gap

Government and industry are slowly waking up to these issues. There is growing recognition that Australia should focus on “sovereign inferencing”: being able to run world‑class models on infrastructure we control, even if some of those models are developed elsewhere. Recent efforts to stand up sovereign AI infrastructure for government workloads show what is possible, but current national capacity is still described as insufficient for frontier‑scale model development.

Here is the catch for universities: if sovereign AI infrastructure emerges only for public sector and commercial use, while research is left to scramble on legacy clusters and variable cloud deals, we will hard‑code a two‑tier system. Public talk of sovereign capability will not match the reality inside labs and research centres that are still queueing for GPU hours.

What Australia needs to do next

If we are serious about sovereign AI and data sovereignty, then university research compute has to be treated as core national infrastructure, not a nice‑to‑have for a few STEM disciplines. We need a coordinated investment plan that ties together HPC, sovereign cloud, trusted data platforms, and the talent to run them, specifically designed to support open, collaborative research.

That also means designing governance so that researchers can work with sensitive datasets on sovereign platforms without spending half their time navigating bespoke agreements and inconsistent rules across jurisdictions. Otherwise, the path of least resistance will remain “just put it on a foreign cloud and hope the contract covers it”, which is the opposite of sovereignty.

Australia has the talent and the policy language. What we are missing is the infrastructure that allows our universities to actually live up to the slogans. Until we fix that, sovereign AI and data sovereignty will remain talking points while our best ideas quietly depend on someone else’s computers.