Evaluating the Use of Clustering for Automatically Organising Digital Library Collections

Mark Hall, Paul Clough, Mark Stevenson

Research output: Contribution to conference › Paper

Abstract

Large digital libraries have become available in recent years through digitisation and aggregation projects. These large collections present a challenge to new users who wish to discover what the collections contain. Subject classification can help in this task; however, in large collections it is frequently incomplete or inconsistent. Automatic clustering algorithms offer a solution, but the question remains whether they produce clusters that are sufficiently cohesive and distinct to support discovery and exploration in digital libraries. In this paper we present a novel approach to investigating cluster cohesion that is based on identifying intruders in a cluster. The results from a human-subject experiment show that clustering algorithms produce clusters that are sufficiently cohesive to be used where no (consistent) manual classification exists.
Original language: English
Pages: 323-334
DOI: https://doi.org/10.1007/978-3-642-33290-6_35
Publication status: Published - 2012
Event: Theory and Practice of Digital Libraries - Paphos, Cyprus
Duration: 23 Sep 2012 - 27 Sep 2012

Conference

Conference: Theory and Practice of Digital Libraries
Country: Cyprus
City: Paphos
Period: 23/09/12 - 27/09/12

Fingerprint

Digital libraries
Clustering algorithms
Analog to digital conversion
Agglomeration
Experiments

Cite this

Hall, M., Clough, P., & Stevenson, M. (2012). Evaluating the Use of Clustering for Automatically Organising Digital Library Collections. 323-334. Paper presented at Theory and Practice of Digital Libraries, Paphos, Cyprus. https://doi.org/10.1007/978-3-642-33290-6_35
@conference{eeb53ec801534665ab4ebbb7b4961a3d,
title = "Evaluating the Use of Clustering for Automatically Organising Digital Library Collections",
author = "Mark Hall and Paul Clough and Mark Stevenson",
note = "Second International Conference, TPDL 2012, Paphos, Cyprus, September 23-27, 2012. Proceedings; Theory and Practice of Digital Libraries ; Conference date: 23-09-2012 Through 27-09-2012",
year = "2012",
doi = "10.1007/978-3-642-33290-6_35",
language = "English",
pages = "323--334",

}

TY - CONF
T1 - Evaluating the Use of Clustering for Automatically Organising Digital Library Collections
AU - Hall, Mark
AU - Clough, Paul
AU - Stevenson, Mark
N1 - Second International Conference, TPDL 2012, Paphos, Cyprus, September 23-27, 2012. Proceedings
PY - 2012
Y1 - 2012
U2 - 10.1007/978-3-642-33290-6_35
DO - 10.1007/978-3-642-33290-6_35
M3 - Paper
SP - 323
EP - 334
ER -