Building a Common Framework for IIR Evaluation

Mark Hall, Elaine Toms

Research output: Contribution to conference, paper, peer-reviewed

25 Citations (Scopus)
162 Downloads (Pure)


Cranfield-style evaluations standardised Information Retrieval (IR) evaluation practices, enabling the creation of programmes such as TREC, CLEF, and INEX, and long-term comparability of IR systems. However, the methodology does not translate well into the Interactive IR (IIR) domain, where the inclusion of the user in the search process and the repeated interaction between user and system create more variability than Cranfield-style evaluations can support. As a result, IIR evaluations of various systems have tended to be non-comparable, not because the systems vary, but because the methodologies used are non-comparable. In this paper we describe a standardised IIR evaluation framework that ensures IIR evaluations can share a standardised baseline methodology, in much the same way that TREC, CLEF, and INEX imposed a process on IR evaluation. The framework provides a common baseline, derived by integrating existing, validated evaluation measures, that enables inter-study comparison, but it is also flexible enough to support most kinds of IIR studies. This is achieved through the use of a "pluggable" system, into which any web-based IIR interface can be embedded. The framework has been implemented and the software will be made available to reduce the resource commitment required for IIR studies.
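To illustrate the "pluggable" idea the abstract describes, the following is a minimal hypothetical sketch, not the authors' actual software or API: a common evaluation shell wraps any plugged-in search interface with a standardised participant flow and records interactions in one shared log format, which is what makes measures comparable across studies. All class and function names here are invented for illustration.

```python
# Hypothetical sketch of a pluggable IIR evaluation harness.
# A common shell handles the standardised study flow; the search
# interface itself is supplied as a plug-in callable that only needs
# to emit events through the shared interaction log.
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class InteractionLog:
    """Uniform event log shared by every plugged-in interface."""
    events: List[dict] = field(default_factory=list)

    def record(self, participant: str, action: str, detail: str) -> None:
        self.events.append(
            {"participant": participant, "action": action, "detail": detail}
        )


class EvaluationShell:
    """Common baseline flow: pre-task measures, task, post-task measures."""

    def __init__(self) -> None:
        self.log = InteractionLog()
        self.interfaces: Dict[str, Callable[[str, InteractionLog], None]] = {}

    def register(self, name: str,
                 interface: Callable[[str, InteractionLog], None]) -> None:
        self.interfaces[name] = interface

    def run_session(self, participant: str, interface_name: str) -> None:
        # The standardised steps surround the pluggable interface,
        # so every study shares the same baseline methodology.
        self.log.record(participant, "pre-task", "questionnaire shown")
        self.interfaces[interface_name](participant, self.log)
        self.log.record(participant, "post-task", "measures collected")


# Example plug-in: a toy search interface emitting two events.
def toy_search_ui(participant: str, log: InteractionLog) -> None:
    log.record(participant, "query", "interactive information retrieval")
    log.record(participant, "click", "result-1")


shell = EvaluationShell()
shell.register("toy", toy_search_ui)
shell.run_session("P01", "toy")
print(len(shell.log.events))  # 4 events: pre-task, query, click, post-task
```

Because every interface writes to the same log schema, inter-study comparison reduces to comparing logs rather than reconciling ad-hoc instruments.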
Original language: English
Publication status: Published - 2013
Event: Conference & Labs of the Evaluation Forum (CLEF) - Valencia, Spain
Duration: 23 Sept 2013 - 26 Sept 2013
