Automatic language ability assessment method based on natural language processing

Nonso Nnamoko, Themis Karaminis, Jack Procter, Joseph Barrowclough, Ioannis Korkontzelos

Research output: Contribution to journal, Article (journal), peer-review

Abstract

Background and Objectives:
The Wechsler Abbreviated Scales of Intelligence second edition (WASI-II) is a standardised assessment tool that is widely used to assess cognitive ability in clinical, research, and educational settings. In one component of this assessment, the Vocabulary task, the assessed individuals are presented with words (called stimulus items) and asked to explain what each word means. Their responses are hand-scored against a list of pre-rated sample responses [0-Point (poor), 1-Point (moderate), or 2-Point (excellent)] provided in the accompanying WASI-II manual. This scoring method is time-consuming, and the scoring of responses that do not fully match the pre-rated ones may vary between individual scorers. In this study, we aim to use natural language processing techniques to automate the scoring procedure and make it more time-efficient and reliable (objective).

Methods:
Using five different word embeddings (Word2vec, Global Vectors, Bidirectional Encoder Representations from Transformers, Generative Pre-trained Transformer 2, and Embeddings from Language Model), we transformed stimulus items and pre-rated responses from the WASI-II Vocabulary task into machine-readable vectors. We measured the closeness of vectors with cosine similarity and evaluated each model against a rational-expectations hypothesis: the vector representation of a stimulus should align closely with its 2-Point responses and diverge from its 0-Point responses. Each model was assessed by how often its representations were consistent with this hypothesis (frequency) and by the Pearson correlation coefficient, which captured overall consistency with the manual’s ranking across all items and sample responses.
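The evaluation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's implementation: the three-dimensional vectors are hypothetical stand-ins for real embeddings, and the consistency rule used here (mean similarity strictly decreasing from 2-Point to 0-Point responses) is one plausible reading of the hypothesis, since the abstract does not give the exact criterion.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical toy data: one stimulus vector and three pre-rated responses.
# Real embeddings would have hundreds of dimensions.
stimulus = [1.0, 0.0, 0.0]
responses = [
    (2, [0.9, 0.1, 0.0]),  # 2-Point sample response
    (1, [0.5, 0.5, 0.0]),  # 1-Point sample response
    (0, [0.0, 0.2, 0.9]),  # 0-Point sample response
]

scores = [score for score, _ in responses]
sims = [cosine_similarity(stimulus, vec) for _, vec in responses]


def mean_similarity(target_score):
    """Mean similarity of all responses carrying the given manual score."""
    selected = [s for score, s in zip(scores, sims) if score == target_score]
    return sum(selected) / len(selected)


# Assumed consistency rule: similarity should fall as the manual score falls.
is_consistent = mean_similarity(2) > mean_similarity(1) > mean_similarity(0)

# Agreement between manual scores and similarities for this item.
r = pearson(scores, sims)
```

Counting the stimulus items for which `is_consistent` holds would give a frequency measure, and computing `pearson` over all item-response pairs would give an overall correlation of the kind reported in the Results.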

Results:
The Word2vec model showed the highest consistency with the WASI-II manual (frequency = 20 out of 27; Pearson correlation coefficient = 0.61), while Bidirectional Encoder Representations from Transformers was the worst-performing model (frequency = 5; Pearson correlation coefficient = 0.05). The consistency of these two models with the WASI-II manual differed significantly, Z = 2.282, p = 0.022.
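As a quick arithmetic check, the reported p-value matches what a two-tailed test on a standard normal statistic would give. The abstract does not name the test used, so the two-tailed normal assumption below is ours, and the function name is illustrative only.

```python
import math


def two_tailed_p_from_z(z):
    """Two-tailed p-value for a standard normal test statistic z.

    Uses the identity P(|Z| > z) = erfc(|z| / sqrt(2)).
    """
    return math.erfc(abs(z) / math.sqrt(2))


# Z value reported in the abstract; the two-tailed normal test is an assumption.
p_value = two_tailed_p_from_z(2.282)
```

Under that assumption, `p_value` falls between 0.022 and 0.023, in line with the reported p = 0.022.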

Conclusions:
Our results showed that the scoring of the WASI-II Vocabulary task can be automated with moderate accuracy relying upon off-the-shelf embedding models. These results are promising, and could be improved further by considering alternative vector dimensions, similarity metrics, and data preprocessing techniques to those used in this study.
Original language: English
Article number: 100094
Journal: Natural Language Processing Journal
Volume: 8
Early online date: 6 Aug 2024
Publication status: Published - 30 Sept 2024

Keywords

  • Cognitive assessment
  • Natural Language Processing
  • Language ability test
  • Cosine similarity
  • WASI-II
  • Word embedding

Research Centres

  • Data and Complex Systems Research Centre