In Search of America: Topic Modelling Nineteenth Century Newspaper Archives

Research output: Contribution to journal › Article

Abstract

This article considers how, and why, “Topic Modelling” tools can be used to analyse historical newspaper archives. While a growing number of media and communication studies projects have applied these techniques to corpuses of born-digital journalism, using the same tools to analyse large-scale collections of historical newspapers requires us to overcome additional technological and methodological challenges. Our discussion is framed around a historical case study examining references to the United States in the 19th Century British Library Newspaper Archive. The article begins by highlighting the problems that researchers of both digital and historical journalism face when attempting to deal with an enormous body of evidence. Next, it argues that Topic Modelling offers one potential solution to these problems by providing a way to “distant read” the archive. The remainder of the article is divided into five experiments that demonstrate how Topic Modelling can be applied to a series of research questions, each of which is applicable to other projects that might make use of newspaper archives. As well as demonstrating the investigative potential of topic modelling, the article also highlights the practical and technological barriers that currently undermine its effectiveness, particularly when it is applied to archives of historical material.
Original language: English
Journal: Digital Journalism
Early online date: 25 Sep 2018
DOIs: 10.1080/21670811.2018.1512879
Publication status: E-pub ahead of print - 25 Sep 2018

Fingerprint

newspaper
nineteenth century
journalism
communication
experiment
evidence

Keywords

  • Anglo-American relations
  • digital humanities
  • distant reading
  • journalism history
  • nineteenth century
  • topic modelling
  • transatlantic

Cite this

@article{a548470fa92d49f5be3d0a63487a906a,
title = "In Search of America: Topic Modelling Nineteenth Century Newspaper Archives",
abstract = "This article considers how, and why, “Topic Modelling” tools can be used to analyse historical newspaper archives. While a growing number of media and communication studies projects have applied these techniques to corpuses of born-digital journalism, using the same tools to analyse large-scale collections of historical newspapers requires us to overcome additional technological and methodological challenges. Our discussion is framed around a historical case study examining references to the United States in the 19th Century British Library Newspaper Archive. The article begins by highlighting the problems that researchers of both digital and historical journalism face when attempting to deal with an enormous body of evidence. Next, it argues that Topic Modelling offers one potential solution to these problems by providing a way to “distant read” the archive. The remainder of the article is divided into five experiments that demonstrate how Topic Modelling can be applied to a series of research questions, each of which is applicable to other projects that might make use of newspaper archives. As well as demonstrating the investigative potential of topic modelling, the article also highlights the practical and technological barriers that currently undermine its effectiveness, particularly when it is applied to archives of historical material.",
keywords = "Anglo-American relations, digital humanities, distant reading, journalism history, nineteenth century, topic modelling, transatlantic",
author = "{Van Galen}, Quintus and Nicholson, Bob",
year = "2018",
month = "9",
day = "25",
doi = "10.1080/21670811.2018.1512879",
language = "English",
journal = "Digital Journalism",
issn = "2167-0811",
publisher = "Taylor \& Francis",
}

TY - JOUR
T1 - In Search of America: Topic Modelling Nineteenth Century Newspaper Archives
AU - Van Galen, Quintus
AU - Nicholson, Bob
PY - 2018/9/25
Y1 - 2018/9/25
AB - This article considers how, and why, “Topic Modelling” tools can be used to analyse historical newspaper archives. While a growing number of media and communication studies projects have applied these techniques to corpuses of born-digital journalism, using the same tools to analyse large-scale collections of historical newspapers requires us to overcome additional technological and methodological challenges. Our discussion is framed around a historical case study examining references to the United States in the 19th Century British Library Newspaper Archive. The article begins by highlighting the problems that researchers of both digital and historical journalism face when attempting to deal with an enormous body of evidence. Next, it argues that Topic Modelling offers one potential solution to these problems by providing a way to “distant read” the archive. The remainder of the article is divided into five experiments that demonstrate how Topic Modelling can be applied to a series of research questions, each of which is applicable to other projects that might make use of newspaper archives. As well as demonstrating the investigative potential of topic modelling, the article also highlights the practical and technological barriers that currently undermine its effectiveness, particularly when it is applied to archives of historical material.
KW - Anglo-American relations
KW - digital humanities
KW - distant reading
KW - journalism history
KW - nineteenth century
KW - topic modelling
KW - transatlantic
U2 - 10.1080/21670811.2018.1512879
DO - 10.1080/21670811.2018.1512879
M3 - Article
JO - Digital Journalism
JF - Digital Journalism
SN - 2167-0811
ER -