Abstract
Sentiment analysis methods co-ordinate text mining components, such as sentence splitters, tokenisers and classifiers, into pipelined applications that automatically analyse the emotions or sentiment expressed in textual content. However, the performance of sentiment analysis pipelines is known to be substantially affected by their constituent components. In this paper, we leverage the Unstructured Information Management Architecture (UIMA) to seamlessly co-ordinate components into sentiment analysis pipelines. We then evaluate a wide range of combinations of text mining components to identify optimal settings. More specifically, we evaluate different pre-processing components (e.g. tokenisers and stemmers), feature weighting schemes (e.g. TF and TFIDF), feature types (e.g. bigrams, trigrams and bigrams+trigrams) and classification algorithms (e.g. Support Vector Machines, Random Forest and Naive Bayes) against six publicly available datasets. The results demonstrate that optimal configurations are consistent across the six datasets, while our UIMA-based pipeline yields robust performance when compared to baseline methods.
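As a rough illustration of the kind of search space the abstract describes, the sketch below extracts n-gram features, applies TF and TF-IDF weighting, and enumerates feature type, weighting scheme and classifier combinations. It is not the paper's UIMA implementation: the function names, the configuration labels and the plain `log(N / df)` form of IDF are all assumptions chosen for brevity, and the classifiers appear only as labels.

```python
import math
from collections import Counter
from itertools import product


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) found in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(tokens, sizes):
    """Count n-grams of every size in `sizes`, e.g. (2, 3) for bigrams+trigrams."""
    counts = Counter()
    for n in sizes:
        counts.update(ngrams(tokens, n))
    return counts


def tf_weights(doc_counts):
    """Term frequency: raw count divided by the document's total feature count."""
    total = sum(doc_counts.values())
    return {f: c / total for f, c in doc_counts.items()} if total else {}


def tfidf_weights(doc_counts, all_doc_counts):
    """TF-IDF using idf = log(N / df); an assumed variant, others are common."""
    n_docs = len(all_doc_counts)
    weights = {}
    for feature, tf in tf_weights(doc_counts).items():
        # Document frequency: number of documents containing this feature.
        df = sum(1 for counts in all_doc_counts if feature in counts)
        weights[feature] = tf * math.log(n_docs / df)
    return weights


# Hypothetical configuration grid; the labels are illustrative only.
FEATURE_TYPES = {"bigrams": (2,), "trigrams": (3,), "bigrams+trigrams": (2, 3)}
WEIGHTINGS = ("TF", "TFIDF")
CLASSIFIERS = ("SVM", "RandomForest", "NaiveBayes")


def configurations():
    """Enumerate every (feature type, weighting, classifier) combination."""
    return list(product(FEATURE_TYPES, WEIGHTINGS, CLASSIFIERS))
```

In an evaluation along these lines, each tuple returned by `configurations()` would select one feature extractor, one weighting function and one classifier, and every resulting pipeline would be scored on each dataset.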
Original language | English |
---|---|
Title of host publication | Natural Language Processing and Information Systems. NLDB 2019. |
Editors | E Metais, F Meziane, S Vadera, V Sugumaran, M Saraee |
Publisher | Springer |
Pages | 286-294 |
Number of pages | 9 |
Volume | 11608 |
ISBN (Electronic) | 978-3-030-23281-8 |
ISBN (Print) | 978303023281 |
Publication status | Published - 21 Jun 2019 |
Event | NLDB 2019: Natural Language Processing and Information Systems - University of Salford, Salford, United Kingdom Duration: 26 Jun 2019 → 28 Jun 2019 http://usir.salford.ac.uk/id/eprint/51593/ |
Publication series
Name | Lecture Notes in Computer Science |
---|---|
Conference
Conference | NLDB 2019: Natural Language Processing and Information Systems |
---|---|
Abbreviated title | NLDB 2019 |
Country/Territory | United Kingdom |
City | Salford |
Period | 26/06/19 → 28/06/19 |
Keywords
- sentiment analysis
- text processing
- interoperability
- UIMA
Fingerprint
Dive into the research topics of 'Evaluating the Accuracy and Efficiency of Sentiment Analysis Pipelines with UIMA'. Together they form a unique fingerprint.
Projects
TYPHON: Polyglot and Hybrid Persistence Architectures for Big Data Analytics
Korkontzelos, Y. (PI) & Bessis, N. (CoI)
1/01/18 → 31/12/20
Project: Research