BERT Goes Brrr: A Venture Towards the Lesser Error in Classifying Medical Self-Reporters on Twitter

This paper describes our team’s submission to the Social Media Mining for Health (SMM4H) 2021 shared task. We participated in three subtasks: classifying adverse drug effects, COVID-19 self-reports, and COVID-19 symptoms. Our system is based on a BERT model pre-trained on domain-specific text. In addition, we perform data cleaning and augmentation, as well as hyperparameter optimization and model ensembling, to further boost BERT's performance. We achieved first place in both the adverse drug effect and COVID-19 self-report classification tasks.
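As a rough illustration of the kind of pipeline the abstract describes, the sketch below fine-tunes a domain-specific BERT checkpoint for binary tweet classification with HuggingFace Transformers. This is not the authors' exact system: the checkpoint name (COVID-Twitter-BERT v2), the hyperparameter values, and the tiny inline dataset are all illustrative assumptions.

```python
# Minimal sketch, not the authors' pipeline: fine-tune a domain-specific BERT
# for binary tweet classification. Checkpoint, hyperparameters, and the toy
# data below are assumptions made for illustration only.
import torch
from torch.utils.data import Dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

MODEL_NAME = "digitalepidemiologylab/covid-twitter-bert-v2"  # assumed checkpoint


class TweetDataset(Dataset):
    """Wraps tokenized tweets and labels for the Trainer API."""

    def __init__(self, texts, labels, tokenizer, max_len=128):
        self.enc = tokenizer(texts, truncation=True,
                             padding="max_length", max_length=max_len)
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item


def main():
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL_NAME, num_labels=2)

    # Toy examples; in practice these would come from the SMM4H task data.
    train_texts = ["just tested positive for covid, fever and chills all night",
                   "great weather for a run today"]
    train_labels = [1, 0]
    train_ds = TweetDataset(train_texts, train_labels, tokenizer)

    args = TrainingArguments(
        output_dir="out",
        num_train_epochs=3,              # illustrative, not tuned values
        per_device_train_batch_size=16,
        learning_rate=2e-5,
        weight_decay=0.01,               # decoupled weight decay via AdamW
    )
    Trainer(model=model, args=args, train_dataset=train_ds).train()


if __name__ == "__main__":
    main()
```

In practice one would also tune these hyperparameters, apply text augmentation to the training split, and ensemble several fine-tuned checkpoints, as the abstract indicates.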
