TY - JOUR
T1 - Classifying emotions in Stack Overflow and JIRA using a multi-label approach
AU - CABRERA DIEGO, LUIS ADRIAN
AU - BESSIS, NIKOLAOS
AU - KORKONTZELOS, YANNIS
N1 - Funding Information:
This research has been carried out as part of the CROSSMINER Project, which has received funding from the European Union's Horizon 2020 Research and Innovation Programme under Grant Agreement No. 732223.
Publisher Copyright:
© 2020 The Authors
PY - 2020/5/11
Y1 - 2020/5/11
AB - A forum or social media post can express multiple emotions, such as love, joy or anger. Emotion classification has been proven useful for measuring aspects such as user satisfaction. Despite its usefulness, research in emotion classification is limited, because the task is multi-label and publicly available data sets and lexica are very limited. A number of emotion classifiers for general-domain text have been proposed recently, but only a few for text in the domain of Open Source Software (OSS), such as EmoTxt. In this paper, we explore different lexica and two multi-label algorithms for classifying emotions in text related to OSS. We trained various multi-label classifiers using HOMER and RAkEL on a data set of Stack Overflow posts and a data set of JIRA Issue Tracker comments. The classifiers have been enriched with features derived from different state-of-the-art lexica. We achieved multi-label Micro F-scores up to 0.811 and Subset 0/1 Loss of 0.290. These results represent a statistically significant improvement over the state-of-the-art.
KW - Multi-label classification
KW - Emotion classification
KW - Stack Overflow
KW - Jira Issue Tracker
UR - http://www.scopus.com/inward/record.url?scp=85080052268&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85080052268&partnerID=8YFLogxK
UR - https://www.mendeley.com/catalogue/4d513c2d-5b41-302b-94a8-1b092df19a0e/
U2 - 10.1016/j.knosys.2020.105633
DO - 10.1016/j.knosys.2020.105633
M3 - Article (journal)
SN - 0950-7051
VL - 195
JO - Knowledge-Based Systems
JF - Knowledge-Based Systems
IS - 105633
M1 - 105633
ER -