Evaluating the Accuracy and Efficiency of Sentiment Analysis Pipelines with UIMA

NABEELA ALTRABSHEH, GEORGIOS KONTONATSIOS, YANNIS KORKONTZELOS

Research output: Chapter in Book/Report/Conference proceeding > Conference proceeding (ISBN) > peer-review

Abstract

Sentiment analysis methods co-ordinate text mining components, such as sentence splitters, tokenisers and classifiers, into pipelined applications to automatically analyse the emotions or sentiment expressed in textual content. However, the performance of sentiment analysis pipelines is known to be substantially affected by their constituent components. In this paper, we leverage the Unstructured Information Management Architecture (UIMA) to seamlessly co-ordinate components into sentiment analysis pipelines. We then evaluate a wide range of combinations of text mining components to identify optimal settings. More specifically, we evaluate different pre-processing components, e.g. tokenisers and stemmers, feature weighting schemes, e.g. TF and TFIDF, feature types, e.g. bigrams, trigrams and bigrams+trigrams, and classification algorithms, e.g. Support Vector Machines, Random Forest and Naive Bayes, against six publicly available datasets. The results demonstrate that optimal configurations are consistent across the six datasets, while our UIMA-based pipeline yields robust performance when compared to baseline methods.
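
As a rough illustration of the feature types and weighting schemes named in the abstract, the following is a minimal Python sketch of bigram/trigram extraction with TF and TFIDF weighting. It is not the paper's UIMA implementation: the function names (`tokenise`, `extract_features`, `tf`, `tfidf`), the whitespace tokeniser and the plain `log(N / df)` inverse document frequency are all assumptions chosen for brevity.

```python
import math
from collections import Counter


def tokenise(text):
    """Lowercase whitespace tokeniser (hypothetical stand-in for a real tokeniser component)."""
    return text.lower().split()


def ngrams(tokens, n):
    """Return all contiguous n-grams of the token list, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(text, feature_type="bigrams"):
    """Extract bigrams, trigrams, or both, in order of n-gram size."""
    sizes = {"bigrams": (2,), "trigrams": (3,), "bigrams+trigrams": (2, 3)}[feature_type]
    tokens = tokenise(text)
    features = []
    for n in sizes:
        features.extend(ngrams(tokens, n))
    return features


def tf(features):
    """Term frequency: feature count divided by the number of features in the document."""
    counts = Counter(features)
    total = sum(counts.values())
    return {f: c / total for f, c in counts.items()} if total else {}


def tfidf(docs_features):
    """TF multiplied by log(N / df); this is one of several common IDF variants (assumed here)."""
    n_docs = len(docs_features)
    df = Counter()
    for features in docs_features:
        df.update(set(features))  # count each feature once per document
    return [
        {f: w * math.log(n_docs / df[f]) for f, w in tf(features).items()}
        for features in docs_features
    ]


docs = ["the film was not good", "the film was very good"]
bigram_docs = [extract_features(d, "bigrams") for d in docs]
weights = tfidf(bigram_docs)
```

Under this IDF variant, a bigram that occurs in every document (such as "the film" in the two toy sentences) receives a weight of zero, while a bigram unique to one document (such as "was not") keeps a positive weight. A classifier such as Naive Bayes or a Support Vector Machine would then be trained on vectors built from these weights.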
Original language: English
Title of host publication: Natural Language Processing and Information Systems. NLDB 2019.
Editors: E Metais, F Meziane, S Vadera, V Sugumaran, M Saraee
Publisher: Springer
Pages: 286
Number of pages: 294
Volume: 11608
ISBN (Electronic): 978-3-030-23281-8
ISBN (Print): 978303023281
Publication status: Published - 21 Jun 2019
Event: NLDB 2019: Natural Language Processing and Information Systems - University of Salford, Salford, United Kingdom
Duration: 26 Jun 2019 - 28 Jun 2019
http://usir.salford.ac.uk/id/eprint/51593/

Publication series

Name: Lecture Notes in Computer Science

Conference

Conference: NLDB 2019: Natural Language Processing and Information Systems
Abbreviated title: NLDB 2019
Country/Territory: United Kingdom
City: Salford
Period: 26/06/19 - 28/06/19

Keywords

  • sentiment analysis
  • text processing
  • interoperability
  • UIMA
