Building a Common Framework for IIR Evaluation

Mark Hall, Elaine Toms

Research output: Contribution to conference › Paper

21 Citations (Scopus)
7 Downloads (Pure)

Abstract

Cranfield-style evaluations standardised Information Retrieval (IR) evaluation practices, enabling the creation of programmes such as TREC, CLEF, and INEX, and long-term comparability of IR systems. However, the methodology does not translate well into the Interactive IR (IIR) domain, where the inclusion of the user in the search process and the repeated interaction between user and system create more variability than Cranfield-style evaluations can support. As a result, IIR evaluations of various systems have tended to be non-comparable, not because the systems vary, but because the methodologies used are non-comparable. In this paper we describe a standardised IIR evaluation framework that ensures IIR evaluations can share a standardised baseline methodology, in much the same way that TREC, CLEF, and INEX imposed a process on IR evaluation. The framework provides a common baseline, derived by integrating existing, validated evaluation measures, that enables inter-study comparison, but is also flexible enough to support most kinds of IIR studies. This is achieved through the use of a "pluggable" system into which any web-based IIR interface can be embedded. The framework has been implemented and the software will be made available to reduce the resource commitment required for IIR studies.
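The paper itself is not reproduced on this page, but the "pluggable" arrangement the abstract describes can be illustrated with a minimal sketch. All names below (PluggableInterface, IIRStudy, the questionnaire placeholders, the example URL) are hypothetical and are not taken from the authors' software; the sketch only shows one way a web-based IIR interface could be embedded by URL inside a shared study workflow that surrounds it with standardised instruments.

```python
# Hypothetical sketch only: illustrates the idea of a "pluggable" IIR study
# framework in which any web-based search interface is embedded by URL and
# wrapped in a common, standardised evaluation workflow.
# None of these names come from the paper or its released software.

from dataclasses import dataclass, field
from typing import List


@dataclass
class PluggableInterface:
    """A web-based IIR interface, identified only by the URL to embed."""
    name: str
    embed_url: str  # e.g. rendered into an <iframe> by the host framework


@dataclass
class IIRStudy:
    """A study that wraps an embedded interface with shared instruments."""
    title: str
    interface: PluggableInterface
    # Placeholder names for standardised, validated measures shared
    # across studies (assumed, not taken from the paper).
    pre_task_questionnaires: List[str] = field(
        default_factory=lambda: ["demographics", "search-experience"])
    post_task_questionnaires: List[str] = field(
        default_factory=lambda: ["task-difficulty", "satisfaction"])

    def workflow(self) -> List[str]:
        """Return the ordered steps a participant would move through."""
        return (
            [f"questionnaire:{q}" for q in self.pre_task_questionnaires]
            + [f"embed:{self.interface.embed_url}"]
            + [f"questionnaire:{q}" for q in self.post_task_questionnaires]
        )


if __name__ == "__main__":
    # Any web-based interface can be plugged in by pointing at its URL;
    # the surrounding workflow stays the same, which is what enables
    # inter-study comparison.
    ui = PluggableInterface(name="example-search",
                            embed_url="https://example.org/search")
    study = IIRStudy(title="Baseline IIR study", interface=ui)
    for step in study.workflow():
        print(step)
```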
Original language: English
Publication status: Published - 2013
Event: Conference & Labs of the Evaluation Forum (CLEF) - Valencia, Spain
Duration: 23 Sep 2013 – 26 Sep 2013

Conference

Conference: Conference & Labs of the Evaluation Forum (CLEF)
Country: Spain
City: Valencia
Period: 23/09/13 – 26/09/13

Fingerprint

Information retrieval
Information retrieval systems

Cite this

Hall, M., & Toms, E. (2013). Building a Common Framework for IIR Evaluation. Paper presented at Conference & Labs of the Evaluation Forum (CLEF), Valencia, Spain.