What if Open Data is no longer enough?
Rethinking Open Data Through the Lens of Data Commons
Everything Everywhere All At Once
A paper I recently read filled me with the kind of joy that makes you print it out, grab a marker, and highlight every single line. That paper is Towards a Sustainable Data Commons Ecosystem by Dr. Ana Méndez de Andés (Dark Matter Labs) and Semra Sönmez (Open Knowledge Foundation) — building on a 2023 Open Knowledge Foundation blog post by CEO Renata Ávila on the evolving definition of openness.
Open data is currently caught in a dilemma – and the reason for this is, of course, AI.
On the one hand, open data is hailed as an achievement and a prerequisite for a fair and well-informed society – which it certainly is for some communities.
On the other, we witness a predatory system that extracts data regardless of the licence or copyright terms in place, with none of the benefits flowing back to the communities that collect and provide that data.
On the contrary: increasingly, some AI blatantly undermines social justice, and AI infrastructure is often built in places where people are already vulnerable. This hurts communities and erodes democratic cohesion.
In 2023, the Open Knowledge Foundation (OKFN) announced the rewriting of the Open Definition. The Open Definition has long set the standards for how the open data community approaches data openness and the surrounding ecosystems.
Acknowledging, however, that tech too often goes hand in hand with clear disadvantages for certain communities and groups, with climate harm, and with race- and gender-based violence, the OKFN has decided to develop its Open Definition further.
To me, Towards a Sustainable Data Commons Ecosystem is a highly valuable update to the Open Definition, one that aims to make open data more inclusive and equitable, not only for the Global North, where open data originated, but also for the Global South, which has fewer resources to share data openly or simply has different concerns regarding data collection.
The study analyses how communities make sense of shared data in environments that aim to support social justice.
The authors argue that the current digital environment is characterized by data extraction and by the concentration of power in opaque AI models that offer communities little in return. If that is so, and if generic, technocratic forms of openness contribute to the erosion of social justice, then the conventional open data approach, built on the premise that data may be used, distributed and reused by anyone for any purpose, might not serve these communities.
The authors therefore advocate for a broader interpretation of openness: a shift from centering technologies, formats and standards to centering the community's lived context, data governance, and data justice, understood as a fair distribution of benefits and harms.
For that they turn to the concept of data commons. While open data asks: Is the data accessible? What is its quality? Under which licence is it available? — data commons asks: Who has access to it? Who governs it? And who benefits from it?
I think there has always been the assumption that, with open data, these questions were automatically raised and answered as a natural by-product of the data-for-good approach. The current situation and the evidence point in a different direction.
--------------------------------------------------------------------------------
So, as Méndez de Andés and Sönmez point out, in environments where generic, technocratic forms of openness risk eroding community rights and social justice, the data commons approach offers a corrective framework — one that places communities back at the centre of data governance.
The authors outline four dimensions of data commons:
- Technological sovereignty refers to the right to interpretative tools, to representation within data systems and to selective access, including the right to obscure data if communities so wish.
- Conditions for sustainability address who can sustain the infrastructure: the right to low-tech approaches, to spaces for experimentation beyond the technological mainstream, and to collective economic models that do not depend on the Global North.
- Supporting structures conceive of data as public infrastructure under community management, with legal instruments such as data trustees, locally adapted licences and organised stakeholder communities.
- Finally, algorithmic justice makes it clear: AI as a tool should not be used to substitute for collective decision-making. Algorithmic justice grants the right to locally based models and federated data structures.
I would like to point out that this is a very condensed and simplified overview — the paper itself goes into considerable depth on the specific rights associated with each of the four aspects.
So what does that mean? Do we need to abandon the current open data approach?
Of course not. However, as open data advocates, we cannot pretend that many communities are not suffering under the current AI extraction frenzy — or that open data is genuinely working in their favour.
What's in it for us? And does it really need to be open? These are valid questions. We can no longer act as though openness automatically equals benefit-sharing, or that the voices and rights of citizens are inherently placed at the centre of the process.
This is precisely why engaging with a data commons framework matters — if nothing else, to make the openness approach more resilient against exploitative AI business models.
