Keyness: Appropriate metrics and practical issues

Costas Gabrielatos, Anna Marchi

    Research output: Contribution to conference › Paper

    Abstract

    In this paper we examine the definitions of two widely used, interrelated constructs in corpus linguistics, keyness and keywords, as presented in the literature and in corpus software manuals. In particular, we focus on (a) the consistency of the definitions given in different sources; (b) the metrics used to calculate the level of keyness; and (c) the compatibility between definitions and metrics. Our survey of studies employing keyword analysis indicates that the vast majority examine only a subset of keywords, almost always the top X keywords as ranked by the metric used. This makes the choice of an appropriate metric central to any study using keyword analysis. We first argue that an appropriate, and therefore useful, metric for keyness needs to be fully consistent with the definition of keyword. We then use four sets of comparisons between corpora of different types and sizes to test whether, and to what extent, the use of different metrics affects the ranking of keywords. More precisely, we look at the extent of overlap in the keyword rankings resulting from different metrics, and we discuss the implications of adopting one metric or another in ranking-based analysis. Finally, we propose a new metric for keyness and demonstrate a simple way to calculate it, which supplements the keyword extraction in existing corpus software.
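
    The kind of ranking comparison described in the abstract can be illustrated with a minimal sketch. It assumes two measures that the abstract itself does not name: log-likelihood as a significance-based score, and the percentage difference of normalised frequencies as an effect-size score. All word counts, corpus sizes, and function names below are hypothetical and chosen only for illustration; this is not presented as the authors' procedure or as their proposed metric.

```python
import math

def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Two-cell log-likelihood (G2) for one word: a significance-based score."""
    combined = freq_study + freq_ref
    total = size_study + size_ref
    expected_study = size_study * combined / total
    expected_ref = size_ref * combined / total
    g2 = 0.0
    for observed, expected in ((freq_study, expected_study), (freq_ref, expected_ref)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2

def percent_difference(freq_study, size_study, freq_ref, size_ref):
    """Percentage difference of normalised frequencies: an effect-size score."""
    norm_study = freq_study / size_study
    norm_ref = freq_ref / size_ref
    if norm_ref == 0:
        return math.inf  # word absent from the reference corpus
    return (norm_study - norm_ref) / norm_ref * 100

def top_n(scores, n):
    """Return the n highest-scoring words, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:n]

def rank_overlap(scores_a, scores_b, n):
    """Proportion of words shared by the top-n lists of two metrics."""
    return len(set(top_n(scores_a, n)) & set(top_n(scores_b, n))) / n

# Hypothetical corpus sizes (in tokens) and raw counts: (study, reference).
# Every word here is relatively more frequent in the study corpus.
SIZE_STUDY, SIZE_REF = 10_000, 100_000
counts = {
    "the": (600, 5000),
    "and": (300, 2800),
    "corpus": (50, 50),
    "keyness": (10, 1),
}

ll_scores = {w: log_likelihood(s, SIZE_STUDY, r, SIZE_REF) for w, (s, r) in counts.items()}
pd_scores = {w: percent_difference(s, SIZE_STUDY, r, SIZE_REF) for w, (s, r) in counts.items()}

for n in (1, 2):
    print(n, top_n(ll_scores, n), top_n(pd_scores, n), rank_overlap(ll_scores, pd_scores, n))
```

    In this toy data the two measures agree on which two words rank highest but disagree on their order, so a study that examines only the top-ranked items could reach different conclusions depending on the metric adopted.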
    Original language: English
    Publication status: Published - 2012
    Event: Corpus-assisted Discourse Studies International Conference - University of Bologna, Italy
    Duration: 13 Sept 2012 - 14 Sept 2012

    Conference

    Conference: Corpus-assisted Discourse Studies International Conference
    Country/Territory: Italy
    Period: 13/09/12 - 14/09/12

    Keywords

    • corpus linguistics
    • keyness
    • keywords
    • metrics
    • frequency difference
    • effect size
    • statistical significance

    Related research output

    • Keyness Analysis: nature, metrics and techniques

      Gabrielatos, C., 7 Feb 2018, Corpus Approaches to Discourse: A Critical Review. Taylor, C. & Marchi, A. (eds.). Oxford: Routledge, p. 225-258 (34 p.)

      Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review