What is the first thing you think of when presented with 200 years of chapbooks printed in Scotland? Shawn Graham (Associate Professor of Digital Humanities at Carleton University) decided to turn them into music…


Topic modelling the data

Spanning 1671-1893, and containing nearly 11 million words, the Chapbooks printed in Scotland collection provides some exciting opportunities for analysis.

Shawn Graham created a topic model of the dataset and shared the challenges he faced along the way: normalising the data, working with messy OCR, and dealing with big datasets without high performance computing resources.
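To give a sense of what normalising messy OCR text can involve before topic modelling, here is a minimal Python sketch. It is not Shawn Graham's actual workflow: the function name normalise, the min_len threshold and the specific cleaning steps are assumptions chosen for illustration.

```python
import re
import unicodedata


def normalise(text, min_len=3):
    """Return lowercase word tokens from noisy OCR text (illustrative only)."""
    # Map the long s found in early printing to a modern 's'.
    text = text.replace("ſ", "s")
    # Fold compatibility characters such as the 'ﬁ' ligature into plain letters.
    text = unicodedata.normalize("NFKC", text)
    # Rejoin words that were hyphenated across a line break.
    text = re.sub(r"-\s*\n\s*", "", text)
    # Keep runs of letters only, which drops digits and punctuation.
    tokens = re.findall(r"[a-z]+", text.lower())
    # Discard very short tokens, which in OCR output are often noise.
    return [t for t in tokens if len(t) >= min_len]


tokens = normalise("The Hiſtory of chap-\nbooks, 1671!")
```

Each step trades accuracy for simplicity. Rejoining every hyphenated line break, for example, will also merge genuine compound words, so a real pipeline would likely need more care.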

The resulting topic model identified broad themes including love, history and religion.


A Song of Scottish Publishing

Using the TwoTone app, he then transformed this data into music, creating a ‘Song of Scottish Publishing’.

Visualisation from TwoTone app displaying frequency of topics over time

Different instruments represent different topics, so, for example, the trumpet represents chapbooks which feature the ‘fortune-making’ topic; the double-bass relates to ‘histories’; the harp is for chapbooks with themes of love.
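The general idea of mapping topic frequencies to notes can be sketched in a few lines of Python. This is not TwoTone's code and does not use the project's data: the counts, the C major scale and the function names to_pitch and sonify are all assumptions made for illustration. Only the topic-to-instrument pairing comes from the description above.

```python
# Topic-to-instrument pairing as described in the text.
INSTRUMENTS = {
    "fortune-making": "trumpet",
    "histories": "double bass",
    "love": "harp",
}

# One octave of C major as MIDI note numbers (C4 to C5); an assumed scale.
C_MAJOR = [60, 62, 64, 65, 67, 69, 71, 72]


def to_pitch(value, lo, hi, scale=C_MAJOR):
    """Map a value in the range [lo, hi] onto a note of the scale."""
    if hi == lo:
        # A flat series has no range to spread over, so use the lowest note.
        return scale[0]
    index = round((value - lo) * (len(scale) - 1) / (hi - lo))
    return scale[index]


def sonify(series_by_topic):
    """Turn each topic's series of values into a list of notes per instrument."""
    tracks = {}
    for topic, values in series_by_topic.items():
        lo, hi = min(values), max(values)
        tracks[INSTRUMENTS[topic]] = [to_pitch(v, lo, hi) for v in values]
    return tracks


# Made-up counts of chapbooks per period, for illustration only.
example = {"love": [0, 7, 3], "histories": [4, 4, 4]}
tracks = sonify(example)
```

In this sketch a higher count produces a higher note, so a rising melody on the harp would signal a period with more chapbooks about love.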

The result is a musical ‘data visualisation’ of over two hundred years of Scottish chapbooks.


Find out more

Read about this project on Shawn Graham’s blog: A Song of Scottish Publishing 1671-1893

Listen to the music on Soundcloud: Sonification of Scottish chapbooks

This project was nominated for the DH Awards 2019: DH Awards 2019 website


Which dataset did this project use?

This project used Chapbooks printed in Scotland: Chapbooks printed in Scotland on the Data Foundry website