AI’s Hidden Extraction Economy
We can’t unring the bell: How Labour and Knowledge Power “Intelligent” Machines
There is a persistent story doing the rounds that artificial intelligence is somehow weightless. A mind in the cloud. A clever assistant. A productivity layer. A tool that appears when summoned, produces an answer, and disappears again. But it isn’t any of those things.
AI is not magic. It is infrastructure. And like all infrastructure, it has supply chains, labour practices, governance failures, environmental costs, and power relations built into it.
Kate Crawford makes this point clearly in Atlas of AI. AI is not artificial in the sense of being detached from the earth, and it is not intelligent in the human-like way that much public discourse imagines. Contemporary large-scale AI is built from minerals, energy, water, data, labour, logistics, cloud infrastructure, and institutional choices.
That matters because the dominant story about AI still tends to frame it as a solution looking for problems. We are told that AI will improve public services, personalise education, accelerate medicine, optimise climate action, and make organisations more efficient.
Some of that may be true. Machine-learning systems can support accessibility, scientific discovery, public health, climate modelling, and better service delivery. But usefulness is not the same as justice. A system can create value in one place while shifting costs somewhere else. And that is the point we need to keep coming back to.
The ethical question is not simply whether AI can do good. It is whether a particular system is necessary, proportionate, accountable, and governed by the people whose data, labour, communities, and environments make it possible (Floridi et al., 2021). Underneath the marketing language of “AI for good” sits an older story. And that is the story of extraction. Not just the extraction of data, although that matters. Not just the extraction of labour, although that matters too. AI also extracts minerals, energy, water, attention, culture, institutional capacity, and trust.
If we want to govern AI properly, we have to stop treating those costs as externalities. They are the system.
AI is built on hidden work
One of the most persistent myths about AI is that it replaces people. But more often, it hides them.
Behind the clean interface and the instant answer sits an enormous amount of human work. Data has to be collected, cleaned, labelled, filtered, moderated, evaluated, and corrected. Model outputs have to be rated. Unsafe outputs have to be identified. Edge cases have to be tested. In many generative AI systems, reinforcement learning from human feedback depends on people making judgments about which outputs are more useful, more acceptable, or less harmful. This is not incidental. It is part of the production process.
Mary L. Gray and Siddharth Suri call this ghost work: human labour hidden behind systems that are designed to look automated. That phrase matters because it punctures the illusion. The machine is not doing all the work. People are doing work that has been made difficult to see.
Some of this work is data annotation. Some of it is content moderation. Some of it is evaluation, red-teaming, and safety testing. Some of it is the ongoing work of keeping digital systems usable for everyone else. Sarah T. Roberts has shown that commercial content moderation is not a marginal clean-up activity. It is a central condition of the modern internet.
AI does not make this labour disappear. It reorganises it.
Cognitive and linguistic tasks are broken into smaller units. Judgment is turned into workflow. Human context becomes training signal. The people doing this work are often far from the product launch, far from the venture capital announcement, and far from the prestige economy of AI. That should trouble us.
Because when the work is invisible, the risks are invisible too. Pay, trauma, safety, bargaining power, attribution, and accountability all become easier to ignore. And if an AI system cannot be built without hidden, poorly governed labour, then we should be honest about what kind of efficiency is being claimed.
Public data is not free raw material
The second extraction problem is knowledge. Large-scale AI systems do not learn from nowhere. They are trained on text, images, code, records, archives, forum posts, books, social media, public websites, technical documentation, and the accumulated labour of millions of people.
Much of that material was created under very different assumptions. People wrote blog posts for a small audience. They answered questions in forums to help a stranger. They contributed to open-source projects. They uploaded art. They shared stories in communities. They built public knowledge resources for public purposes. Then that material became training data.
This is where the phrase “publicly available data” does too much work. Publicly available does not mean socially unencumbered. Access is not the same as consent. Scrapeable is not the same as fair game.
A poem, a support-forum post, an open-source repository, a Wikipedia contribution, and a private grief shared in a public-ish place are not the same kind of thing simply because a crawler can reach them. Context matters. Purpose matters. Consent matters. Governance matters.
This is not only a copyright issue, although copyright matters. It is also a data governance issue.
- What was collected?
- For what purpose?
- Under what authority?
- With what documentation?
- Who can contest it?
- Who benefits?
- Who carries the risk?
Work on dataset documentation, such as datasheets for datasets, exists because datasets have histories, limitations, intended uses, and social conditions that need to be made visible (Gebru et al., 2018). Without that discipline, collective knowledge goes in and proprietary systems come out. That is not innovation on its own. It is enclosure.
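To make those questions concrete, here is a minimal sketch of a dataset record that does not count as documented until every question has an answer. The class and field names are hypothetical, chosen to mirror the questions listed above; this is not the datasheet schema proposed by Gebru et al., only an illustration of the discipline.

```python
from dataclasses import dataclass, fields


@dataclass
class DatasetRecord:
    """Illustrative record: one field per governance question (names are assumptions)."""
    what_was_collected: str = ""
    purpose: str = ""
    authority: str = ""
    documentation: str = ""
    who_can_contest: str = ""
    who_benefits: str = ""
    who_carries_risk: str = ""

    def unanswered(self) -> list[str]:
        """Return the names of the questions that are still blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Example: a record where only two questions have been answered so far.
record = DatasetRecord(
    what_was_collected="Public support-forum posts",
    purpose="Train a help-desk assistant",
)
missing = record.unanswered()
print(missing)
```

The point of a structure like this is less the code than the habit: a dataset with blank fields is a dataset whose history has not been made visible yet.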
The cloud is not weightless
The third extraction problem is material. We still talk about “the cloud” as though it floats above us. It does not. It sits in data centres, transmission lines, substations, cooling systems, fibre networks, land, water, and hardware supply chains.
Crawford and Vladan Joler’s Anatomy of an AI System makes this visible by tracing the Amazon Echo through mineral extraction, labour, data flows, logistics, cloud infrastructure, and e-waste. That kind of mapping is useful because it forces us to look past the device and past the interface. The device is only the visible part. The system is much larger.
And the language matters here. It is not accurate enough to say AI depends on “rare earths” and leave it there. Some digital components do use rare earth elements, but AI and data infrastructure also depend on critical minerals such as lithium, cobalt, nickel, copper, gold, tantalum, tin, and tungsten. These are not all rare earth elements, but they are all part of the broader material reality of digital systems.
The same is true of energy and water. Training and running AI systems requires compute. Compute requires chips, servers, electricity, cooling, maintenance, and replacement cycles. The footprint varies by model, location, data centre design, energy mix, and use case, so we should be careful about overclaiming. But we should be just as careful about under-governing.
In a CIGI interview, Crawford argues that AI is an extractive industry not only because it draws on data, but because it draws on labour, time, and natural resources. She also notes that the true resource costs of commercial AI systems are hard to assess because so much relevant information is held by companies as proprietary knowledge. That should be a governance red flag.
If we cannot properly see the energy, water, land, labour, and materials required to run these systems, then we cannot properly decide which uses are worth it.
A model used to improve emergency response is not the same as a model used to generate disposable marketing sludge. A system that supports accessibility is not the same as one that produces spam at scale. A tool that helps clinicians is not the same as one that automates surveillance or punishment.
AI changes work because it changes power
The labour-market conversation around AI is often framed in apocalyptic terms. The robots are coming for all the jobs. Or, on the other side, AI will magically free everyone from drudgery. We have seen this movie before.
The evidence is still developing. A recent Stanford Digital Economy Lab overview argues that aggregate employment effects appear limited so far, but that some impacts may be concentrated among AI-exposed entry-level workers. The Australian Parliamentary Library similarly emphasises uncertainty: AI may improve productivity and create new tasks, but it may also increase inequality, concentrate gains, and affect workers unevenly across occupations and demographic groups. So, we should be careful about simple predictions.
But we do not need to wait for perfect labour-market data to see the governance problem. AI changes work because it changes power. It changes who gets to decide how work is measured. It changes what counts as expertise. It changes who is visible and who is replaceable. It changes where accountability sits.
When AI is embedded in productivity suites, contact centres, HR systems, education platforms, welfare systems, and public administration, it does more than automate tasks. It reshapes the environment in which decisions are made.
Workers can become dashboards. Judgment can become a score. Context can be flattened into a metric. Autonomy can be traded away in the name of efficiency.
This is where the “augmentation” story needs scrutiny. Augmentation for whom? On whose terms? With what rights to challenge the system? With what ability to refuse?
We cannot unring the bell
There is no clean return to a pre-AI world. The models exist. The data centres exist. The procurement processes are underway. The investment has been made. The tools are being embedded into everyday software. People are already changing how they work, write, search, code, decide, and organise.
We cannot unknow what we now know how to build. But that does not mean the current path is inevitable. This distinction matters. Irreversibility is not the same thing as inevitability. AI’s extractive form is not a law of nature. It is the result of choices about scale, ownership, data rights, labour conditions, procurement, environmental disclosure, security, and institutional governance.
Governance cannot be bolted on at the end
If there is one practical lesson here, it is this: governance cannot be an afterthought.
We cannot build vast AI systems first and then sprinkle ethics on top. We cannot run pilots, create shadow AI, accumulate data, sign vendor contracts, and only then ask whether the system is safe, fair, sustainable, or necessary.
That way lies the shemozzle. The work has to start earlier. It has to be built into purpose, design, procurement, data collection, model development, deployment, monitoring, retirement, and disposal.
| Governance question | Why it matters |
| --- | --- |
| What problem are we solving? | Prevents AI being used because it is fashionable rather than necessary. |
| What data is needed, and why? | Keeps data minimisation and purpose limitation at the centre. |
| Who does the hidden labour? | Makes annotation, moderation, evaluation, and red-teaming visible as work. |
| What are the material costs? | Forces attention to energy, water, hardware, supply chains, and e-waste. |
| Who can contest the system? | Turns accountability from a slogan into a practical right. |
| When should the system be stopped? | Treats refusal, rollback, and retirement as part of lifecycle governance. |

Data minimisation is a good example. For years it has been treated as a privacy principle that everyone agrees with and then quietly ignores. But in an AI-enabled world, keeping too much data is not just a privacy problem. It is a security problem, a governance problem, and an extraction problem. The less unnecessary data an organisation holds, the less there is to misuse, leak, scrape, infer from, or feed into systems without proper accountability.
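As a rough illustration of what building minimisation into a system can look like, the sketch below keeps only the fields a declared purpose needs and discards everything else before storage. The purpose name, field names, and the `ALLOWED_FIELDS` mapping are all hypothetical; a real system would take them from its own purpose register and legal basis.

```python
# Hypothetical mapping from a declared purpose to the fields it actually needs.
ALLOWED_FIELDS = {
    "appointment_reminder": {"name", "phone", "appointment_time"},
}


def minimise(record: dict, purpose: str) -> dict:
    """Keep only the fields the declared purpose needs; drop the rest."""
    if purpose not in ALLOWED_FIELDS:
        # No declared purpose means no collection at all.
        raise ValueError(f"No declared purpose: {purpose!r}")
    allowed = ALLOWED_FIELDS[purpose]
    return {key: value for key, value in record.items() if key in allowed}


# Example input containing more than the purpose requires (made-up values).
raw = {
    "name": "Sam",
    "phone": "0400 000 000",
    "appointment_time": "2030-01-01T09:00",
    "date_of_birth": "1990-01-01",
    "notes": "free text",
}
kept = minimise(raw, "appointment_reminder")
print(sorted(kept))
```

The design choice worth noticing is the default: an undeclared purpose raises an error rather than falling back to keeping everything.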
The hidden workers of AI need rights
A serious national and international AI governance agenda must include labour. Not as a footnote. Not as a corporate social responsibility statement. As core infrastructure.
The people who label data, moderate content, evaluate outputs, red-team systems, and provide the human judgment that makes AI systems usable need fair pay, safe conditions, psychological support, transparency, and collective power.
They also need recognition. If their work is essential enough to make the system function, then it is essential enough to govern properly.
The same applies to workers affected by AI deployment. Organisations introducing AI need more than tool policies. They need communication, training, consultation, leadership alignment, and clear accountability. AI adoption is not mainly a technology problem. It is a people problem.
Ignore that, and the result will not be transformation. It will be confusion, resistance, misuse, and risk.
We need open institutions, not just open models
There is a lot of talk about open AI. Some of it is useful. But open weights or open code are not enough if the institutions around them remain concentrated, opaque, and unaccountable.
By open AI I generally mean one of the following:
- Open‑source AI - These are models where the code and weights, and sometimes the training data, are released under open licences so anyone can inspect, tinker with, and reuse them - think of the many community models on GitHub or Hugging Face.
- Open access AI - Here the underlying models stay proprietary, but they’re made widely available through APIs and products so lots of people can build on top of them; “open” here is about reach and availability rather than genuine transparency.
- Open benefit AI - In this framing, “open” is about who gains, not how the model is built: the goal is to ensure advanced AI benefits humanity broadly and doesn’t end up tightly controlled by a small group, whether or not the underlying systems are open‑source.
But the deeper question is institutional.
- Who sets the rules?
- Who gets heard?
- Who can inspect the system?
- Who can challenge a decision?
- Who benefits from public knowledge?
- Who carries the environmental and labour costs?
Software and AI are never neutral. They encode choices about identity, visibility, ranking, moderation, access, and control. Those choices can quietly hard-code existing structures of power into the systems that shape everyday life.
That is why we need open, public, civic, and community-governed institutions. Not because openness is a magic word, but because concentrated technological power needs counterweights.
The history of industrial upheaval is not only a history of machines. It is also a history of institutions: unions, mutual aid, cooperatives, public libraries, professional bodies, standards organisations, and regulatory systems. The AI age will need its own institutions too.
Not just better products. Better counter-power.
Doing the work
We cannot unring the AI bell. We have eaten of the tree of AI knowledge. The systems are here, and they will keep changing how organisations and societies operate. But we can refuse the lazy story that extraction is inevitable.
AI does not have to mean endless data hoarding, invisible labour, environmental opacity, vendor lock-in, and weak accountability. Those are choices. Profitable choices, in many cases. Convenient choices. But choices nonetheless.
The work now is practical and political.
Build data minimisation into systems. Make hidden labour visible and properly protected. Demand environmental disclosure. Treat AI as lifecycle infrastructure. Strengthen public procurement. Support open and community-governed institutions. Give affected people meaningful rights to challenge, refuse, and shape the systems being built around them.
This is not glamorous work. It is governance. It is institutions. It is accountability. It is doing the work.
And if AI is going to be part of our future, then that work matters more than ever.