TY - JOUR
T1 - Analysis of the effect of sentiment analysis on extracting adverse drug reactions from tweets and forum posts
AU - Korkontzelos, Ioannis
AU - Nikfarjam, Azadeh
AU - Shardlow, Matthew
AU - Sarker, Abeed
AU - Ananiadou, Sophia
AU - Gonzalez, Graciela H.
PY - 2016/8/1
Y1 - 2016/8/1
AB - Objective
The abundance of text available in social media and health-related forums, along with the rich expression of public opinion, has recently attracted the interest of the public health community in using these sources for pharmacovigilance. Based on the intuition that patients express negative sentiment when posting about Adverse Drug Reactions (ADRs), we investigate the effect of sentiment analysis features in locating ADR mentions.
Methods
We enrich the feature space of a state-of-the-art ADR identification method with sentiment analysis features. Using a corpus of posts from the DailyStrength forum and tweets annotated for ADR and indication mentions, we evaluate the extent to which sentiment analysis features help in locating ADR mentions and distinguishing them from indication mentions.
Results
Evaluation results show that sentiment analysis features marginally improve ADR identification in tweets and health-related forum posts. Adding sentiment analysis features achieved a statistically significant F-measure increase from 72.14% to 73.22% in the Twitter part of an existing corpus using its original train/test split. Using stratified 10 × 10-fold cross-validation, statistically significant F-measure increases were shown in the DailyStrength part of the corpus, from 79.57% to 80.14%, and in the Twitter part of the corpus, from 66.91% to 69.16%. Moreover, sentiment analysis features are shown to reduce the number of ADRs being recognized as indications.
Conclusion
This study shows that adding sentiment analysis features can marginally improve the performance of even a state-of-the-art ADR identification method. Given the rapidly increasing popularity of social media and health forums, this improvement can be of use to pharmacovigilance practice.
KW - Text mining
KW - Sentiment analysis
KW - Social media
KW - Adverse drug reactions
KW - Public Health
KW - Social Media
KW - Drug-Related Side Effects and Adverse Reactions
KW - Humans
KW - Internet
KW - Pharmacovigilance
UR - http://www.journals.elsevier.com/journal-of-biomedical-informatics/
UR - http://www.scopus.com/inward/record.url?scp=84978034203&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84978034203&partnerID=8YFLogxK
UR - https://www.mendeley.com/catalogue/c21c125a-16c8-3650-bb94-e45f412613ce/
U2 - http://dx.doi.org/10.1016/j.jbi.2016.06.007
DO - http://dx.doi.org/10.1016/j.jbi.2016.06.007
M3 - Article
C2 - 27363901
VL - 62
SP - 148
EP - 158
JO - Journal of Biomedical Informatics
JF - Journal of Biomedical Informatics
SN - 1532-0464
ER -