Olds and News

Reflections on Politics, Technicalities and Practicalities of Archiving

Archives are collections of things – of written, spoken or sung records, of seeds and fossils, of art and books and beauty, of the most terrible, most wondrous and most tedious of our histories. They are born of a desire to preserve and to this end, they freeze time, enabling those who will have access to the archive to re-visit these moments. They thus speak as much of the present and future as they do of the past.

Archiving has its roots – both in its etymology and practice – in hierarchies. As a way of structuring knowledge, hierarchies can render archives legible and navigable, but they always bear the imprint of those involved in their making. As such, archiving has always been a way of assigning value to things deemed worth keeping by those in power. As a practice carried out by the few, archiving was used as a tool to affirm colonialism and (academic) white supremacy. Colonial archives disregarded the contexts in which texts and objects were created and to which they belonged. In so doing, they re-interpreted, re-classified and re-evaluated from a white perspective, erasing the knowledge they purported to preserve through theft and appropriation. As the German archeologist and ethnographer, Leo Frobenius, himself an advocate for colonialism and proponent of racist theories, wrote, "The natural sciences collected and the ethnographic museums swelled like pregnant hippos. But what strange things had been collected in the ethnographic field, one did not know, one still does not know today." [1]

Archives are the product of actions and practices we will collectively call archiving. Archiving consists of many steps that include gathering, sorting, (re-)naming, deeming worthy, contextualizing (or not), appropriating, separating an object from its life-cycle and preserving. As a practice, archiving has rightly drawn criticism. Calls to decolonize and deconstruct the practice have been gaining more and more traction. At the same time, driven by digitalization, contemporary archive work is increasing (e.g. The Living Archives). At the center of the debate on how to practise archiving in a post-colonial context are questions about the processes of knowledge production, and of balancing the construction and cementation of past and present identities with enabling futures and ensuring empowerment.

In the context of SUPERRR Lab’s project, Muslim Futures, we've been dealing with questions of responsible and reflective archiving. In this article we draw on our experiences of these epistemic processes. We are passionate about justice in the context of digital processes – from design and data policies to infrastructures – and our thoughts and ideas also derive from past SUPERRR projects on technology and its social contexts, as well as hands-on experience in archiving (Elisa is a trained archaeologist).

In March 2023, we were invited to Yale University, where Ouassima is a Visiting Researcher at the Center for Race, Indigeneity, and Transnational Migration, to speak with the Intersectional Black European Studies working group on Transnational Archives and benefit from a mutual exchange. This visit prompted us to write down the questions that we regularly ask ourselves and the thoughts and discussions we had there in order to share our reflection process with everyone interested in the topic. In the following text, we want to speak of archiving as a practice that is shaped by many, and sometimes conflicting, intentions. This piece does not provide answers. Rather, it's an invitation to think about the act of archiving, its intricacies and its beauty.

The politics of archiving

What do we mean when we talk about the politics of archiving? Too often, archiving is seen in purely administrative terms rather than recognized as the socio-political act that it is, a position which dodges critical questions. Because archiving is a multifaceted process, critical and holistic reflections on it are indispensable. This text is an attempt to share the questions we've been asking ourselves and open up space for reflections, mistakes and learning.

Becoming the archive: The political, the personal and the collective

Archiving is political, and a political practice – so the question of what is archived by whom and with what intentions matters. What positionalities and situated knowledges inform archiving processes and are being translated into it? The ones who archive become part of the archive, so we must ask who is present in its design and how the decision-making process is constituted.

To archive knowledges and materials in the present(s) leads to remembering in the futures. But for marginalized/racialized communities, our presents are reflective of the pasts, raising questions about the kind of materials and sets with which we are working. What has already been protected, archived and held as valuable enough to be restored, and how should these items stand in relation to post-colonial archives? When communities decide to archive, it is an act of resistance, of knowledge production that disturbs hegemonic remembrance cultures. It endows communities with the power to disrupt what was deemed worthy of archiving before and what was not. Yes, archiving is political and a political practice – so to prevent the reproduction of hegemonic logics, continuous self-reflection and reflection on its (digital) structures is crucial. By transparently documenting the process and addressing the gaps openly as a practice of deconstruction, these reflections can be materialized, showing what has or has not been archived, and why.

We know it's possible to get lost in critical reflections and thinking, in the “meta levels” of the archiving process. But the doing is as important as the result, and it is part of the process to make mistakes, to learn while archiving, to test and create prototypes, to go back to the materials and take it from there. It is not a linear process, nor a perfectly designed one – especially when it comes to community-led archiving. Here, there’s room for autonomy and claiming the process as one’s own. The process itself becomes part of the archive.

Who makes the work flow?

When building and filling an archive, it's important to ask who is present and who is not. Digital archiving is often described as a technical workflow from which human factors are seemingly absent. But because archiving is a political act, they never are. Digitizing and archiving both involve tasks and flows of work that range from fascinating and empowering to tedious and repetitive. People with different kinds of qualification and levels of expertise are needed to make a digital archive, which raises many questions. Who designs the data structure and who decides which tools to use? Who interprets original sources and provides contextual information, and on what basis? What work is carried out by students, researchers, volunteers from the community? What tasks are outsourced and what work is paid? Is it the qualified work, the defining work, the tedious work?

Access to technical systems is typically managed by assigning “user roles” to people’s accounts, for example administrator, moderator, editor or user. Each role comes with different rights and options, from "active" administrator to "passive" user. Are these roles pre-defined by the technical system or are they tailored to the respective archive and the communities behind them? How well do user roles translate into the demographics of a user (or rather, contributor) base? Do they imply different levels of worthiness or power where there are no grounds for it? These questions need to be asked from both technical and societal perspectives.

These reflections go hand-in-hand with documenting the relationships between materials and archivists. Firstly, what exactly are the materials? What are the lexical layers of the objects that are to be archived and what are their primary associations? What lexical layers are not included, and for what reason? And why not make archiving a sensual practice? These and other impressions are worth documenting as they inform the archiving process, especially when it’s the first time someone is archiving. Could we pay attention to the sensory relationships that emerge from interactions between the material and the archivist? When you hold an old magazine in your hands, how does it feel? Is there a particular smell connected to it? What stories could arise from these sensory moments?

Another constellation in the archiving cosmos is the relationship between the authors of materials and their intended readers. Who is the archive catering to, and who is excluded? How does the medium influence a text or a product, for example, and can this be documented? Can the archive continue to pay tribute to these dynamics through the passage of space and time? Imagine the archive one hundred years from now – what would you observe? What would be an ideal future scenario for this archive, and the dynamics and histories it contains? And last but not least, what will happen to the original materials? Are they accessible somewhere, are they centralized? Do they risk being lost? As we plant the seeds of the digital archive, we should safely archive its analogue counterpart in parallel.

The politics of archiving: Archiving as a political/critical practice

We stated at the beginning that we are posing more questions than we are offering answers or solutions. The process of archiving changes depending on the practicalities, data and materials, contexts and communities involved. This is a good thing. It allows for community-led, contextualized and dynamic approaches towards archiving as a political practice, while bearing in mind the politics of archiving, its power dynamics and historical continuities. As mentioned above, archiving is a dynamic process and the people involved become part of the archive, step by step, material by material, from archive to archive. This allows for reflections, learning, testing and repeating. By reflecting on the politics of archiving, the dimensions of technologies and the economies of (digital) archiving come into focus.

Technologies and economies of (digital) archiving

Digital archives are the products of processes that are both technical and social. These processes are shaped by the people who archive and the resources available to them – be it time, money, experience or access to technical gadgets. At the same time, a digital archive is a technical tool, and like all technology, it can be detached neither from the end for which it was created nor from the social context in which it is embedded. This, of course, means that the politics, technologies and economies of archiving can never be cleanly separated. There are, however, aspects of archiving that we tend to consider from a technical, process-oriented point of view, the political effects of which we often fail to analyze. Because of this, the technical and political aspects must be connected from the start and the goals of digital archiving projects must be clear and transparent. There is neither a single nor a right way to digitally archive physical sources, but for precisely this reason, the process and technologies behind it need to be designed and documented to allow for the interpretation of its products.

Meta matters

Building a digital archive always involves abstraction. No source, no object, no text can be digitized in its entirety. While archives are usually clear about the data they contain – digital reproductions, metadata, contextual information – it is not obvious which potentially relevant aspects are not reflected in the data. Archives should indicate what data is and is not collected in order to aid reflections on the kinds of questions their contents can and cannot answer. This is, additionally, affected by the technologies used to digitize or to automate the archiving process. What technologies are used to automate the process, and do they – like optical character recognition (OCR) – present a potential source of mistakes? Are they documented, and is it clear how they affect the digital output? What data and file formats are used, and how accessible and long-lasting are they?
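One way to make such gaps visible is to record them next to the data itself. The following is a minimal sketch, not a standard: the class names, field names and the example values are all our own assumptions, chosen only to show that "what was not captured, and why" and "which automated steps touched this item" can live in the same record as the metadata.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ProcessingStep:
    """One manual or automated step applied to a source (hypothetical schema)."""
    name: str          # e.g. "OCR"
    tool: str          # whatever tool the archive actually used
    known_issues: str  # how this step may distort the digital output


@dataclass
class ArchiveRecord:
    """A single archived item with its documented gaps (hypothetical schema)."""
    identifier: str
    captured: dict[str, str]      # aspect -> what the data reflects
    not_captured: dict[str, str]  # aspect -> reason it is missing
    processing: list[ProcessingStep] = field(default_factory=list)

    def gaps_statement(self) -> str:
        """Return one line per aspect the record cannot speak to."""
        return "\n".join(
            f"{aspect}: {reason}"
            for aspect, reason in sorted(self.not_captured.items())
        )


# Illustrative values only; none of them describe a real archive item.
example = ArchiveRecord(
    identifier="magazine-0001",
    captured={"text": "OCR transcription", "image": "page scans"},
    not_captured={
        "smell and feel of the paper": "not recorded during digitization",
        "handwritten marginalia": "illegible in the scan",
    },
    processing=[
        ProcessingStep("OCR", "unnamed OCR tool", "may misread diacritics"),
    ],
)
print(example.gaps_statement())
```

Keeping the gaps in a structured field rather than in a separate report means they travel with the record whenever it is exported or reused.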

The power in data

Data is power in every way, and can be used or misused. Because archives contain data which cross-reference other sources, even entire separate archives, they are larger than the sum of their parts, creating relations between authors, objects, texts and images. Contemporary archives are likely to contain pieces of personal information relating to living individuals that might seem negligible in isolation but together can provide a detailed snapshot of a private life – from networks of friendship and proof of participation in events, to patterns of movement. The question of what should not be archived and made publicly accessible is an issue worth reflecting upon, especially in the context of racialized communities and their archiving endeavours. Are there safeguards in place to protect personal data? Can linking data sets across sources lay open community structures that outsiders can use in a harmful way? Does it make sense to anonymize selectively with regard to community care and security?
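To make the last question more concrete, here is a minimal sketch of what selective anonymization could look like in practice. Everything in it is an assumption for illustration: the record layout, the `people` field, the function names, and the idea that the community itself maintains the list of names to protect. It uses a keyed hash from Python's standard library so that pseudonyms cannot be reversed without the secret key.

```python
import hashlib
import hmac


def pseudonym(name: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym from a name using a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(secret_key, name.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"person-{digest[:10]}"


def anonymize_selectively(record: dict, protected_names: set, secret_key: bytes) -> dict:
    """Return a copy of the record for publication.

    Names listed in protected_names are replaced by pseudonyms;
    all other names are left as they are. The input record is not modified.
    """
    public = dict(record)
    public["people"] = [
        pseudonym(person, secret_key) if person in protected_names else person
        for person in record.get("people", [])
    ]
    return public


# Illustrative values only.
key = b"kept-offline-by-the-community"
internal = {"title": "Meeting notes", "people": ["Private Member", "Public Speaker"]}
published = anonymize_selectively(internal, {"Private Member"}, key)
print(published["people"])
```

Note the trade-off in this design: because the same name always maps to the same pseudonym, records can still be linked to one another inside the archive. Where that linkage is itself the risk, a fixed placeholder or leaving the name out entirely may be the safer choice.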

The question of language(s)

Digital archives, like most digital technologies, predominantly use English or other colonial languages, mostly on the technical level but also at the level of content. Discourse in these languages is thereby favoured over others, perpetuating colonial power structures. Translation is often avoided because it is expensive, and while it is possible to auto-translate content, this comes with additional challenges and dilemmas. Auto-translation works well between Germanic and Romance languages, but less well with others. The quality of the auto-translation correlates with a language's proximity to the Germanic and Romance language families, how widely it is spoken (indicating a larger market, an economic factor for auto-translation service providers) and how many text samples are digitally available to train language models (a resource question for auto-translation service providers). All of these factors contribute to enhancing auto-translation results in colonial languages, making these languages more visible and increasing the proportion of the internet that these languages take up.

This dilemma is starkly apparent on Wikipedia for languages like Cebuano or Waray. While both have a notable number of first-language speakers, Wikipedia articles in these and other languages have mostly been created by auto-translation bots. As a result, Cebuano is now the second biggest Wikipedia by article count, but its auto-translated articles mostly refer to topics, personalities and localities in other countries, burying content that was created locally and/or on topics more relevant to Cebuano or Waray speakers. As one Filipino Wikipedian explains online, the few human contributors now face the impossible task of maintaining articles they didn’t write. Moreover, there is no formal human oversight of the quality of the (auto)translations, leading to unintelligible or misleading articles that lessen the credibility of the entire project, including the articles written by human authors.

This challenge is not unique to Wikipedia but applies to all archiving projects that use auto-translation, especially when they rely on communities to contribute to their archive. Lastly, auto-translating content takes money from skilled translators of under-represented languages. All of this shows that there is no simple or correct way to address the localization of digital archives, but there is a need for informed decision making.

Legalities of archiving

Digital archives exist within a variety of legal frameworks that influence their content, their rights of usage and their impact. The starting point is the right to publicity and privacy when it comes to sources that relate to living persons. Legal ramifications vary from country to country and need to be taken into account, but they only serve as a baseline for digital archives. Are names or images published without informed consent? Do individuals have the right to withdraw their consent to have their data archived and published? Are there processes in place for redress? Beyond these personal rights, copyright plays an important role in digital archiving. It differs between countries, depending on the original copyright holder. But questions of copyright don't touch only upon the sources that end up in a digital archive – they also apply to the archive itself. When sources are digitized, the digital products themselves may be copyrighted and restrict usage, especially in countries without a fair use law.

Permissive licenses, like Creative Commons licenses, work on top of national copyright law and allow others to use digital copies, depending on the level of restriction chosen. For so-called “traditional knowledge”, especially when it comes from underrepresented groups or even groups whose survival is at risk, open licenses are crucial and problematic at the same time. Crucial, because they allow their own community to use and remix material freely. Problematic, because that same material is open for others to use as well. Big Tech companies like Meta and Google are using these data sets to improve their language models, but without compensating their authors in any meaningful way, and often without consent. Within the legalities of digitizing and publishing archives exists a gap, accounted for by post-colonial power imbalances.

Outcomes first, details later?

In this article we've sought to share the questions that have arisen over the course of our past and ongoing archiving projects and our thoughts about them. It is an attempt to encourage discussions and to connect to existing work.

The meta dimensions of digital archives are inextricably linked to the practical ones. Understood together, these aspects constitute a movement towards a critical practice. The Politics, Technologies, Economies and Legalities of archiving are interwoven, influence each other and are translated into the contexts in which the archiving is taking place. Understanding the specificities of those translations comes with practice, so in undertaking a digital archive, the first thing is to start the project, focus on initial outcomes (digitizing, registering/indexing, uploading one material at a time, etc.), and take it from there. Context always matters – a solution for one archiving project might not be suitable for another – and while reflecting critically in the early stages can save time, frustration and resources, the process does not need to be over-engineered. Creating a space for trying, and for learning with and by the process, is fundamental. Test the process to see whether it fits the specific material and context you're working with, and is viable with the resources you have to hand. If it doesn't, try another route. In this way, by allowing for reflections and informed decisions on knowledge production, community needs and the goals for the archive, the process becomes a critical practice.

These approaches might seem contradictory – but we are arguing for nuance and a healthy relationship with simultaneous and non-linear routes. Shall we take it from here?

[1] Frobenius, Leo: Vom Schreibtisch zum Äquator. Frankfurt am Main: Frankfurter Societäts-Druckerei, 1925.