Selecting query terms to build a specialised corpus from a restricted-access database

Research output: Contribution to journal (Article)

Abstract

This paper proposes an accessible measure of the relevance of additional terms to a given query, describes and comments on the steps leading to its development, and discusses its utility. The measure, termed relative query term relevance (RQTR), draws on techniques used in information retrieval, and can be combined with a technique used in creating corpora from the world wide web, namely keyword analysis. It is independent of reference corpora, and does not require knowledge of the number of (relevant) documents in the database. Although it does not make use of user/expert judgements of document relevance, it does allow for subjective decisions. However, subjective decisions are triangulated against two objective indicators: keyness and, mainly, RQTR.
Original language: English
Pages (from-to): 5-44
Number of pages: 40
Journal: ICAME Journal
Volume: 31
Publication status: Published - Apr 2007

Fingerprint

Information retrieval
World Wide Web

Keywords

  • corpora
  • corpus building
  • text database
  • query expansion
  • query term relevance
  • keywords

Cite this

@article{aa970007376d4472b19686ee3948f6b5,
title = "Selecting query terms to build a specialised corpus from a restricted-access database",
abstract = "This paper proposes an accessible measure of the relevance of additional terms to a given query, describes and comments on the steps leading to its development, and discusses its utility. The measure, termed relative query term relevance (RQTR), draws on techniques used in information retrieval, and can be combined with a technique used in creating corpora from the world wide web, namely keyword analysis. It is independent of reference corpora, and does not require knowledge of the number of (relevant) documents in the database. Although it does not make use of user/expert judgements of document relevance, it does allow for subjective decisions. However, subjective decisions are triangulated against two objective indicators: keyness and, mainly, RQTR.",
keywords = "corpora, corpus building, text database, query expansion, query term relevance, keywords",
author = "Costas Gabrielatos",
year = "2007",
month = "4",
language = "English",
volume = "31",
pages = "5--44",
journal = "ICAME Journal",

}

Selecting query terms to build a specialised corpus from a restricted-access database. / Gabrielatos, Costas.

In: ICAME Journal, Vol. 31, 04.2007, p. 5-44.


TY - JOUR

T1 - Selecting query terms to build a specialised corpus from a restricted-access database

AU - Gabrielatos, Costas

PY - 2007/4

Y1 - 2007/4

N2 - This paper proposes an accessible measure of the relevance of additional terms to a given query, describes and comments on the steps leading to its development, and discusses its utility. The measure, termed relative query term relevance (RQTR), draws on techniques used in information retrieval, and can be combined with a technique used in creating corpora from the world wide web, namely keyword analysis. It is independent of reference corpora, and does not require knowledge of the number of (relevant) documents in the database. Although it does not make use of user/expert judgements of document relevance, it does allow for subjective decisions. However, subjective decisions are triangulated against two objective indicators: keyness and, mainly, RQTR.

AB - This paper proposes an accessible measure of the relevance of additional terms to a given query, describes and comments on the steps leading to its development, and discusses its utility. The measure, termed relative query term relevance (RQTR), draws on techniques used in information retrieval, and can be combined with a technique used in creating corpora from the world wide web, namely keyword analysis. It is independent of reference corpora, and does not require knowledge of the number of (relevant) documents in the database. Although it does not make use of user/expert judgements of document relevance, it does allow for subjective decisions. However, subjective decisions are triangulated against two objective indicators: keyness and, mainly, RQTR.

KW - corpora

KW - corpus building

KW - text database

KW - query expansion

KW - query term relevance

KW - keywords

M3 - Article

VL - 31

SP - 5

EP - 44

JO - ICAME Journal

JF - ICAME Journal

ER -