The National Librarian’s Research Fellowship in Digital Scholarship 2024-25

Dr Andrea Kocsis is a Chancellor’s Fellow in Humanities Informatics at the University of Edinburgh. She comes from an interdisciplinary and international background, and her research combines heritage studies with data and network science. Andrea has a strong interest in digital storytelling, especially web development that facilitates interaction between GLAM institutions and their users. As a National Librarian’s Research Fellow in Digital Scholarship (2024-25), her goal is to make web archives more accessible to wider audiences.


Connecting web archives with users through the health information in the Archive of Tomorrow collection

The project uses research conducted on the UK Web Archive’s Archive of Tomorrow (AoT) dataset as a gateway to interacting with web archives. While web archives might seem intimidating at first glance, they hold a wealth of knowledge for a variety of users, from professional researchers to everyday library visitors who wish to better understand our recent past. By designing new interfaces and resources, the project aims to make web archives more accessible to both researchers and broader audiences.

The Archive of Tomorrow – Talking about Health project ran from 2022 to 2023, collecting online health information. During this time, Andrea collaborated with the web archivists at the University of Cambridge on a pilot machine learning study to understand the collection’s potential. Building on these preliminary results, Andrea will use AoT data to facilitate the understanding of web archives, explore their usability, and help users connect with the Library and each other by discovering our shared discussion on health.

The project has three objectives:

1. Develop an interactive web app and display screen to explore and play with the dataset

Aimed at a broad audience, the interactive web platform will serve as a gamified interface to the collection, where users can explore and experiment with the data and the results.

2. Create Jupyter notebooks for the Data Foundry’s notebook collection

For those who would like to engage more deeply with the dataset through distant reading, the project offers Jupyter notebooks showing how to rehydrate the articles from metadata and how to carry out basic Natural Language Processing on them.

3. Entry-level technical workshops

To bridge the potential digital literacy gap between users and dataset creators, the project offers beginner, non-coding workshops on distant reading using the collection as an example.
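As a rough illustration of the kind of basic Natural Language Processing mentioned under the second objective, the sketch below counts the most frequent terms across a few short texts. It is not taken from the project’s notebooks: the sample texts are invented, the stop-word list is a tiny placeholder, and the function names `tokenize` and `top_terms` are hypothetical. It assumes the page text has already been rehydrated into plain strings.

```python
import re
from collections import Counter

# Placeholder stop-word list, kept tiny for illustration (an assumption,
# not the list used by the project).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on"}


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def top_terms(documents, n=5):
    """Return the n most frequent non-stop-word terms across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(t for t in tokenize(doc) if t not in STOP_WORDS)
    return counts.most_common(n)


# Invented sample texts standing in for rehydrated archive pages.
sample_pages = [
    "Vaccine advice for the winter season",
    "Advice on sleep and mental health",
    "Mental health support in the community",
]

print(top_terms(sample_pages, n=3))
```

Term counts like these are one simple entry point into distant reading: they summarise what a collection talks about without anyone having to read every page.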


Related links

More about Andrea’s project:

Andrea Kocsis ‘Heritage in the Digital Age’ blog: Web archives are not boring

Project proposal website: Predicting Misinformation in the Archive of Tomorrow Dataset

Find out more about previous Fellowship projects:

Dr Yann Ryan, The National Librarian’s Research Fellowship in Digital Scholarship 2023-24

Dr Gustavo Candela, The National Librarian’s Research Fellowship in Digital Scholarship, 2022-23

Dr Rosa Filgueira, The National Librarian’s Research Fellowship in Digital Scholarship, 2021-22

Dr Giles Bergel, The National Librarian’s Research Fellowship in Digital Scholarship, 2020-21