‘Historical Research in the Digital Age’, Part 4: ‘Researching with Big Data; and how historians can work collaboratively’

Feb 7, 2023 | General, Guest Posts, Historical Research in the Digital Age


In this post we continue our new series, ‘Historical Research in the Digital Age’, which explores historians’ use and understanding of the digital tools and sources that shape modern research culture. The series explores the impact and implications of digital resources (positive and negative) for how historians work today.

In Part Four we hear from Ruth Ahnert, who is Professor of Literary History & Digital Humanities at Queen Mary University of London. Here, Ruth considers how historians can work with big data, with reference to the need for, and approaches to, interdisciplinary collaboration. Ruth draws on her experience of leading Living with Machines, an interdisciplinary project that brings together historians and data scientists and is based at the British Library and Alan Turing Institute. Ruth and fellow researchers describe the project, and the opportunities and challenges of interdisciplinary working, in their new book, Collaborative Historical Research in the Age of Big Data, published this month and freely available Open Access.

The ‘Historical Research in the Digital Age’ series is hosted by Ian Milligan from the University of Waterloo, Ontario, whose new book, The Transformation of Historical Research in the Digital Age, is now available as a free Open Access download from Cambridge University Press. Later in the series we’ll hear from historian-users of digital tools, and from those who research in environments where digital resources remain limited.



Digital History is looking increasingly attractive due to the availability of digitised content, as discussed in Parts 1 and 3 of this series. This opportunity is accompanied by a challenge, though. Traditionally, historians have not been trained in the technical or statistical skills needed to leverage digital content without the aid of user interfaces. A small part of the solution to this challenge lies in the provision of additional digital training in humanities programmes. But, for the most part, interdisciplinary collaboration offers the most realistic solution: bringing together teams that combine those with domain-specialist knowledge of the sources and those with the technical skills to computationally leverage source material at scale.

While an increasing number of scholars are conceiving, managing and executing large-scale multi-disciplinary projects in this space, the practice is still young, and there are few models of how to go about such a process. Scientists in various subfields learn how a lab is run from early in their research careers; historians who find themselves involved in large projects or collaborative initiatives, by contrast, often have no blueprint to look to.

This is the gap that my co-authors Emma Griffin, Mia Ridge, Giorgia Tolfo and I were seeking to address with our new open access book Collaborative Historical Research in the Age of Big Data: Lessons from an Interdisciplinary Project. The book shares lessons from the experience of running a very large collaborative digital project called Living with Machines (LwM), which was funded by UKRI’s Strategic Priorities Fund (administered by the AHRC).

LwM brings together historians, data scientists and research software engineers, curators and library professionals, computational linguists, visualisation experts, and digital humanists. Its aim is to provide a data-driven history of the coming of the machine age in the period c. 1780-1920. The book is designed for those thinking of taking on a digital humanities project whether large or small, and it tackles three key issues: data, infrastructure, and collaboration. Here I will be focusing on the broader issue of collaboration.  




We might think collaboration should be easy, that it simply divides a task – distributing work along lines of expertise. But setting up effective collaboration itself takes a lot of work. Our project is a good case study not only because it has a large team from diverse disciplines and professional backgrounds, but also because we had a cold start with most of us never having worked together before. The work of establishing collaboration is multifaceted. You need to create a shared understanding of the common goals, as well as what the different disciplines get out of the process of working together. But more important than that, you need to create a sense of community.  

One of the things that swiftly becomes clear when beginning a project like LwM is that people from different disciplines speak in different languages, using words others don’t recognise, or (worse) using the same words to mean different things. To think about how to overcome this, we use Peter Galison’s metaphor of a ‘trading zone’, which draws on anthropological studies of how pidgin languages and creoles develop in border zones to allow communication and the exchange of goods.

How did we create pidgin languages and creoles through which we could express our shared project vision? In the book we describe some strategies that worked for us, as well as things that were challenging. These included establishing a project charter to articulate our values, a delivery plan to share our goals, and a commitment to treating these as living documents that will evolve as our practice does. We created spaces to learn from one another: we had a reading group which developed into a group called ‘hypothesis generation’. We provided in-team training, which in some cases focused on acquiring or brushing up on skills, but in others on gaining enough of the native language to communicate better with our collaborators. And we had ‘coffee and code’ sessions where people could bring their technical problems to find shared solutions.




Talking the talk is one thing, but putting words into action is another. One of the most important decisions we took as a project was to jump-start our collaborations by adapting the idea of the minimum viable product (MVP), a concept popularised by Eric Ries in 2009. Conceived in the world of software development, an MVP is a product with just enough features to satisfy early customers, to test hypotheses and to provide feedback for future product development. We employed the modified concept of a ‘minimum research outcome’. The aim was to reach a point where we had first results, or proof-of-concept outcomes, that (if successful) could be iteratively developed in subsequent phases of the project. This mechanism allowed us to move from conceptualising our working methodology to actively collaborating.

Collaboration doesn’t just entail working in different ways; it also requires models of credit and authorship that may be unfamiliar in the humanities. Collaborative projects need to be clear from the outset about their policy on authorship and credit. Data-driven work requires layers of labour, from data acquisition, to data wrangling or preprocessing, method conceptualisation and implementation, through to historical analysis, writing and editing, and many other things besides. We believe it is vital that all that work is credited, with generous author lists and full crediting of labour in our other kinds of outputs.



By sharing our various experiences on LwM – both the good and the bad – we hope to help other teams starting collaborative projects. If you want to come and hear us talk more about the book, please sign up for our book launch on 7 March 2023 at 5pm (sign up here).

On 23 May 2023, I will also be chairing a Royal Historical Society panel on collaborative digital history: ‘Digital History and Collaborative Research: a Practitioners’ Discussion’, with contributions from Daniel Edelstein (Stanford), Maryanne Kowaleski (Fordham), Jon Lawrence (Exeter) and Katrina Navickas (Hertfordshire). Further details will follow, and booking will open shortly.


About the Author


Ruth Ahnert is Professor of Literary History & Digital Humanities at Queen Mary, University of London. A specialist in early modern literary culture, Ruth’s publications include The Rise of Prison Literature in the Sixteenth Century (2013) and an edited collection, Re-forming the Psalms in Tudor England (2015).

Since 2012, Ruth’s work has increasingly been focused on applying data science to research in the humanities. Previous projects include the application of quantitative network analysis to the study of early modern letters. Her recent publications include The Network Turn: Changing Perspectives in the Humanities (2020, with Sebastian E. Ahnert, Nicole Coleman and Scott Weingart) and Collaborative Historical Research in the Age of Big Data (2023, with Emma Griffin, Mia Ridge and Giorgia Tolfo) which draws on her experience of interdisciplinary project work as Principal Investigator for Living with Machines based at the British Library and Alan Turing Institute. 






‘Historical Research in the Digital Age’ is a 6-part series of posts on the Royal Historical Society’s blog, published between December 2022 and February 2023. The series is designed and hosted by Ian Milligan, Professor of History at the University of Waterloo, Ontario. It’s prompted by Ian’s new book, The Transformation of Historical Research in the Digital Age (available Open Access via Cambridge University Press, 2022), which considers the impact and implications of digital resources for contemporary historical practice.

In addition to his own essay, ‘We Are All Digital Now: And what this means for historical research’ (December 2022), Ian invites five contributors to continue the discussion from several perspectives:

  • the builder of digital tools for historians: Part 2, with William J. Turkel
  • the archivist-interpreter who mediates between resources (and their commercial providers) and users: Part 3, with Anna McNally
  • the collaborative and interdisciplinary researchers who bring historical and computer science knowledge to big data: Part 4, with Ruth Ahnert 
  • the many without access to such resources given the many ‘digital disparities’ of infrastructure and sources that exist: Part 5, with Gerben Zaagsma
  • the historian-user who applies digital resources to their work, and the implications of this: Part 6, with Jo Guldi






Part One: ‘We are all Digital Now: and what this means for historical research’, by Ian Milligan

Part Two: ‘Tools for the Trade: and how historians can make best use of them’, by William J. Turkel

Part Three: ‘Why Archivists Digitise: and why it matters’, by Anna McNally

Part Four: ‘Researching with Big Data; and how historians can work collaboratively’, by Ruth Ahnert

Part Five: ‘Digitising History from a Global Context; and what this tells us about access and inequality’, by Gerben Zaagsma






The Society’s blog, Historical Transactions, offers regular think pieces on historical research projects and approaches to the past. These include several previous series, addressing wide-ranging questions concerning historical methods and the value of historical thinking.

Recent series include ‘Writing Race’ and ‘What is History For?’ We welcome proposals for other short series of posts, bringing historians together to discuss topics, practices and values. If you’d like to suggest an RHS blog series, please email: philip.carter@royalhistsoc.org.
