Analysis of the effect of sentiment analysis on extracting adverse drug reactions from tweets and forum posts

Ioannis Korkontzelos, Azadeh Nikfarjam, Matthew Shardlow, Abeed Sarker, Sophia Ananiadou, Graciela H. Gonzalez

Research output: Contribution to journal › Article

45 Citations (Scopus)

Abstract

Objective: The abundance of text available in social media and health related forums along with the rich expression of public opinion have recently attracted the interest of the public health community to use these sources for pharmacovigilance. Based on the intuition that patients post about Adverse Drug Reactions (ADRs) expressing negative sentiments, we investigate the effect of sentiment analysis features in locating ADR mentions.

Methods: We enrich the feature space of a state-of-the-art ADR identification method with sentiment analysis features. Using a corpus of posts from the DailyStrength forum and tweets annotated for ADR and indication mentions, we evaluate the extent to which sentiment analysis features help in locating ADR mentions and distinguishing them from indication mentions.

Results: Evaluation results show that sentiment analysis features marginally improve ADR identification in tweets and health related forum posts. Adding sentiment analysis features achieved a statistically significant F-measure increase from 72.14% to 73.22% in the Twitter part of an existing corpus using its original train/test split. Using stratified 10 × 10-fold cross-validation, statistically significant F-measure increases were shown in the DailyStrength part of the corpus, from 79.57% to 80.14%, and in the Twitter part of the corpus, from 66.91% to 69.16%. Moreover, sentiment analysis features are shown to reduce the number of ADRs being recognized as indications.

Conclusion: This study shows that adding sentiment analysis features can marginally improve the performance of even a state-of-the-art ADR identification method. This improvement can be of use to pharmacovigilance practice, due to the rapidly increasing popularity of social media and health forums.
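The results in the abstract are stated as F-measure values. As a rough illustration only, the sketch below assumes the balanced F-measure (F1, the harmonic mean of precision and recall), which the abstract does not spell out, and computes the absolute gains from the three before/after pairs quoted above. The function names and the counts in the usage note are hypothetical and are not taken from the paper.

```python
def f_measure(tp: int, fp: int, fn: int) -> float:
    """Balanced F-measure (F1) from true positives, false positives and false negatives.

    Assumption: the abstract's "F-measure" is the balanced F1 score.
    """
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# F-measure pairs (baseline, with sentiment features) quoted in the abstract, in percent.
REPORTED = {
    "Twitter, original train/test split": (72.14, 73.22),
    "DailyStrength, 10 x 10-fold cross-validation": (79.57, 80.14),
    "Twitter, 10 x 10-fold cross-validation": (66.91, 69.16),
}


def absolute_gains(reported: dict) -> dict:
    """Absolute F-measure gain in percentage points for each evaluation setting."""
    return {name: round(after - before, 2) for name, (before, after) in reported.items()}


if __name__ == "__main__":
    for name, gain in absolute_gains(REPORTED).items():
        print(f"{name}: +{gain:.2f} points")
```

For example, with hypothetical counts of 8 true positives, 2 false positives and 2 false negatives, `f_measure(8, 2, 2)` gives precision and recall of 0.8 each, hence an F-measure of 0.8. The gains computed from the reported pairs are 1.08, 0.57 and 2.25 percentage points, which is consistent with the abstract describing the improvement as marginal.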
Original language: English
Pages (from-to): 148-158
Journal: Journal of Biomedical Informatics
Volume: 62
Early online date: 27 Jun 2016
DOI: http://dx.doi.org/10.1016/j.jbi.2016.06.007
Publication status: E-pub ahead of print - 27 Jun 2016

Fingerprint

Drug-Related Side Effects and Adverse Reactions
Health
Public health
Social Media
Pharmacovigilance
Intuition
Public Opinion
Public Health

Keywords

  • Adverse drug reactions
  • Social media
  • Sentiment analysis
  • Text mining

Cite this

Korkontzelos, Ioannis ; Nikfarjam, Azadeh ; Shardlow, Matthew ; Sarker, Abeed ; Ananiadou, Sophia ; Gonzalez, Graciela H. / Analysis of the effect of sentiment analysis on extracting adverse drug reactions from tweets and forum posts. In: Journal of Biomedical Informatics. 2016 ; Vol. 62. pp. 148-158.
@article{2731dba0c1c8426a9bb26a8d7a4c7023,
title = "Analysis of the effect of sentiment analysis on extracting adverse drug reactions from tweets and forum posts",
abstract = "Objective The abundance of text available in social media and health related forums along with the rich expression of public opinion have recently attracted the interest of the public health community to use these sources for pharmacovigilance. Based on the intuition that patients post about Adverse Drug Reactions (ADRs) expressing negative sentiments, we investigate the effect of sentiment analysis features in locating ADR mentions. Methods We enrich the feature space of a state-of-the-art ADR identification method with sentiment analysis features. Using a corpus of posts from the DailyStrength forum and tweets annotated for ADR and indication mentions, we evaluate the extent to which sentiment analysis features help in locating ADR mentions and distinguishing them from indication mentions. Results Evaluation results show that sentiment analysis features marginally improve ADR identification in tweets and health related forum posts. Adding sentiment analysis features achieved a statistically significant F-measure increase from 72.14{\%} to 73.22{\%} in the Twitter part of an existing corpus using its original train/test split. Using stratified 10 × 10-fold cross-validation, statistically significant F-measure increases were shown in the DailyStrength part of the corpus, from 79.57{\%} to 80.14{\%}, and in the Twitter part of the corpus, from 66.91{\%} to 69.16{\%}. Moreover, sentiment analysis features are shown to reduce the number of ADRs being recognized as indications. Conclusion This study shows that adding sentiment analysis features can marginally improve the performance of even a state-of-the-art ADR identification method. This improvement can be of use to pharmacovigilance practice, due to the rapidly increasing popularity of social media and health forums.",
keywords = "Adverse drug reactions, Social media, Sentiment analysis, Text mining",
author = "Ioannis Korkontzelos and Azadeh Nikfarjam and Matthew Shardlow and Abeed Sarker and Sophia Ananiadou and Gonzalez, {Graciela H.}",
year = "2016",
month = "6",
day = "27",
doi = "10.1016/j.jbi.2016.06.007",
language = "English",
volume = "62",
pages = "148--158",
journal = "Journal of Biomedical Informatics",
issn = "1532-0464",
publisher = "Elsevier",

}

Analysis of the effect of sentiment analysis on extracting adverse drug reactions from tweets and forum posts. / Korkontzelos, Ioannis; Nikfarjam, Azadeh; Shardlow, Matthew; Sarker, Abeed; Ananiadou, Sophia; Gonzalez, Graciela H.

In: Journal of Biomedical Informatics, Vol. 62, 27.06.2016, p. 148-158.

TY - JOUR

T1 - Analysis of the effect of sentiment analysis on extracting adverse drug reactions from tweets and forum posts

AU - Korkontzelos, Ioannis

AU - Nikfarjam, Azadeh

AU - Shardlow, Matthew

AU - Sarker, Abeed

AU - Ananiadou, Sophia

AU - Gonzalez, Graciela H.

PY - 2016/6/27

Y1 - 2016/6/27

AB - Objective The abundance of text available in social media and health related forums along with the rich expression of public opinion have recently attracted the interest of the public health community to use these sources for pharmacovigilance. Based on the intuition that patients post about Adverse Drug Reactions (ADRs) expressing negative sentiments, we investigate the effect of sentiment analysis features in locating ADR mentions. Methods We enrich the feature space of a state-of-the-art ADR identification method with sentiment analysis features. Using a corpus of posts from the DailyStrength forum and tweets annotated for ADR and indication mentions, we evaluate the extent to which sentiment analysis features help in locating ADR mentions and distinguishing them from indication mentions. Results Evaluation results show that sentiment analysis features marginally improve ADR identification in tweets and health related forum posts. Adding sentiment analysis features achieved a statistically significant F-measure increase from 72.14% to 73.22% in the Twitter part of an existing corpus using its original train/test split. Using stratified 10 × 10-fold cross-validation, statistically significant F-measure increases were shown in the DailyStrength part of the corpus, from 79.57% to 80.14%, and in the Twitter part of the corpus, from 66.91% to 69.16%. Moreover, sentiment analysis features are shown to reduce the number of ADRs being recognized as indications. Conclusion This study shows that adding sentiment analysis features can marginally improve the performance of even a state-of-the-art ADR identification method. This improvement can be of use to pharmacovigilance practice, due to the rapidly increasing popularity of social media and health forums.

KW - Adverse drug reactions

KW - Social media

KW - Sentiment analysis

KW - Text mining

UR - http://www.journals.elsevier.com/journal-of-biomedical-informatics/

U2 - http://dx.doi.org/10.1016/j.jbi.2016.06.007

DO - 10.1016/j.jbi.2016.06.007

M3 - Article

VL - 62

SP - 148

EP - 158

JO - Journal of Biomedical Informatics

JF - Journal of Biomedical Informatics

SN - 1532-0464

ER -