In Search of America: Topic Modelling Nineteenth Century Newspaper Archives

Quintus Van Galen, Bob Nicholson

Research output: Contribution to journal › Article (journal) › peer-review

2 Citations (Scopus)
220 Downloads (Pure)


This article considers how, and why, “Topic Modelling” tools can be used to analyse historical newspaper archives. While a growing number of media and communication studies projects have applied these techniques to corpuses of born-digital journalism, using the same tools to analyse large-scale collections of historical newspapers requires us to overcome additional technological and methodological challenges. Our discussion is framed around a historical case study examining references to the United States in the 19th Century British Library Newspaper Archive. The article begins by highlighting the problems that researchers of both digital and historical journalism face when attempting to deal with an enormous body of evidence. Next, it argues that Topic Modelling offers one potential solution to these problems by providing a way to “distant read” the archive. The remainder of the article is divided into five experiments that demonstrate how Topic Modelling can be applied to a series of research questions, each of which is applicable to other projects that might make use of newspaper archives. As well as demonstrating the investigative potential of topic modelling, the article also highlights the practical and technological barriers that currently undermine its effectiveness, particularly when it is applied to archives of historical material.
Original language: English
Journal: Digital Journalism
Early online date: 25 Sept 2018
Publication status: E-pub ahead of print - 25 Sept 2018

Keywords


  • Anglo-American relations
  • digital humanities
  • distant reading
  • journalism history
  • nineteenth century
  • topic modelling
  • transatlantic

Research Centres

  • Research Centre for Nineteenth-Century Studies

