Selecting query terms to build a specialised corpus from a restricted-access database

Research output: Contribution to journal › Article

Abstract

This paper proposes an accessible measure of the relevance of additional terms to a given query, describes and comments on the steps leading to its development, and discusses its utility. The measure, termed relative query term relevance (RQTR), draws on techniques used in information retrieval, and can be combined with a technique used in creating corpora from the world wide web, namely keyword analysis. It is independent of reference corpora, and does not require knowledge of the number of (relevant) documents in the database. Although it does not make use of user/expert judgements of document relevance, it does allow for subjective decisions. However, subjective decisions are triangulated against two objective indicators: keyness and, mainly, RQTR.
Original language: English
Pages (from-to): 5-44
Number of pages: 40
Journal: ICAME Journal
Volume: 31
Publication status: Published - Apr 2007

Keywords

  • corpora
  • corpus building
  • text database
  • query expansion
  • query term relevance
  • keywords
