Does BERT Make Any Sense? Interpretable Word Sense Disambiguation with Contextualized Embeddings

Contextualized word embeddings (CWE), such as those provided by ELMo (Peters et al., 2018), Flair NLP (Akbik et al., 2018), or BERT (Devlin et al., 2019), are a major recent innovation in NLP. CWEs provide semantic vector representations of words that depend on their respective context. Their advantage over static word embeddings has been shown for a number of tasks, such as text classification, sequence tagging, and machine translation. Since vectors of the same word type can vary with the context, they implicitly provide a model for word sense disambiguation (WSD). We introduce a simple but effective approach to WSD that uses nearest neighbor classification on CWEs. We compare the performance of different CWE models on this task and report improvements over the current state of the art on two standard WSD benchmark datasets. We further show that the pre-trained BERT model is able to place polysemous words into distinct 'sense' regions of the embedding space, while ELMo and Flair NLP do not seem to possess this ability.
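The nearest neighbor idea described above can be illustrated with a minimal sketch. It assumes that each sense-annotated training occurrence of a word has already been mapped to a contextualized vector. The three-dimensional toy vectors, the sense labels, the function names `cosine` and `knn_sense`, the choice of cosine similarity, and the value `k=3` are all illustrative assumptions, not details taken from the paper.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def knn_sense(query, train_vecs, train_senses, k=3):
    """Predict a sense by majority vote among the k most similar training vectors."""
    # Rank training occurrences by similarity to the query occurrence.
    ranked = sorted(
        range(len(train_vecs)),
        key=lambda i: cosine(query, train_vecs[i]),
        reverse=True,
    )
    # Count the sense labels of the k nearest neighbors.
    votes = Counter(train_senses[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical stand-ins for contextualized embeddings of the word "bank".
train_vecs = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [0.1, 0.9, 0.2],
    [0.0, 0.8, 0.3],
]
train_senses = ["bank%finance", "bank%finance", "bank%river", "bank%river"]

# A new occurrence whose vector lies close to the "finance" examples.
query = [0.85, 0.15, 0.05]
predicted = knn_sense(query, train_vecs, train_senses, k=3)
print(predicted)
```

In a real setting the vectors would come from a pre-trained CWE model, and the quality of the prediction would depend on how well that model separates the senses in its embedding space.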

[1]  Karsten Müller,et al.  Fanning the Flames of Hate: Social Media and Hate Crime , 2020, Journal of the European Economic Association.

[2]  Jaewoo Kang,et al.  BioBERT: a pre-trained biomedical language representation model for biomedical text mining , 2019, Bioinform..

[3]  Rudresh Panchal,et al.  Online hatred of women in the Incels.me forum , 2019, Journal of Language Aggression and Conflict.

[4]  Alexander Mehler,et al.  BIOfid Dataset: Publishing a German Gold Standard for Named Entity Recognition in Historical Biodiversity Literature , 2019, CoNLL.

[5]  Ben Burtenshaw,et al.  Offence in Dialogues: A Corpus-Based Study , 2019, RANLP.

[6]  Tom De Smedt,et al.  Right-wing German Hate Speech on Twitter: Analysis and Automatic Detection , 2019, ArXiv.

[7]  Bela Gipp,et al.  Enriching BERT with Knowledge Graph Embeddings for Document Classification , 2019, KONVENS.

[8]  Fernando Benites TwistBytes - Hierarchical Classification at GermEval 2019: walking the fine line (of recall and precision) , 2019, KONVENS.

[9]  Benoît Sagot,et al.  What Does BERT Learn about the Structure of Language? , 2019, ACL.

[10]  Omer Levy,et al.  RoBERTa: A Robustly Optimized BERT Pretraining Approach , 2019, ArXiv.

[11]  Chris Biemann,et al.  Hierarchical Multi-label Classification of Text with Capsule Networks , 2019, ACL.

[12]  Michael Richter,et al.  Interaction of Information Content and Frequency as Predictors of Verbs' Lengths , 2019, BIS.

[13]  Yiming Yang,et al.  XLNet: Generalized Autoregressive Pretraining for Language Understanding , 2019, NeurIPS.

[14]  Liang Zou,et al.  NULI at SemEval-2019 Task 6: Transfer Learning for Offensive Language Detection using Bidirectional Transformers , 2019, *SEMEVAL.

[15]  Paolo Rosso,et al.  SemEval-2019 Task 5: Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter , 2019, *SEMEVAL.

[16]  Ali Hakimi Parizi,et al.  UNBNLP at SemEval-2019 Task 5 and 6: Using Language Models to Detect Hate Speech and Offensive Language , 2019, SemEval@NAACL-HLT.

[17]  Constantin Orasan,et al.  RGCL-WLV at SemEval-2019 Task 12: Toponym Detection , 2019, *SEMEVAL.

[18]  Jianming Wang,et al.  BNU-HKBU UIC NLP Team 2 at SemEval-2019 Task 6: Detecting Offensive Language Using BERT model , 2019, *SEMEVAL.

[19]  Giuseppe De Pietro,et al.  Deep neural network for hierarchical extreme multi-label text classification , 2019, Appl. Soft Comput..

[20]  Ion Androutsopoulos,et al.  Extreme Multi-Label Legal Text Classification: A Case Study in EU Legislation , 2019, Proceedings of the Natural Legal Language Processing Workshop 2019.

[21]  Maosong Sun,et al.  ERNIE: Enhanced Language Representation with Informative Entities , 2019, ACL.

[22]  Benjamin Lecouteux,et al.  Sense Vocabulary Compression through the Semantic Knowledge of WordNet for Neural Word Sense Disambiguation , 2019, GWC.

[23]  SangKeun Lee,et al.  From Small-scale to Large-scale Text Classification , 2019, WWW.

[24]  Douglas Biber,et al.  Register, Genre, and Style , 2019 .

[25]  Ingmar Weber,et al.  Racial Bias in Hate Speech and Abusive Language Detection Datasets , 2019, Proceedings of the Third Workshop on Abusive Language Online.

[26]  Jimmy J. Lin,et al.  DocBERT: BERT for Document Classification , 2019, ArXiv.

[27]  Alexander Peysakhovich,et al.  PyTorch-BigGraph: A Large-scale Graph Embedding System , 2019, SysML.

[28]  Preslav Nakov,et al.  SemEval-2019 Task 6: Identifying and Categorizing Offensive Language in Social Media (OffensEval) , 2019, *SEMEVAL.

[29]  Dmitry Mouromtsev,et al.  Relation Extraction Datasets in the Digital Humanities Domain and their Evaluation with Word Embeddings , 2019, ArXiv.

[30]  Preslav Nakov,et al.  Predicting the Type and Target of Offensive Posts in Social Media , 2019, NAACL.

[31]  Guillaume Lample,et al.  Cross-lingual Language Model Pretraining , 2019, NeurIPS.

[32]  Holger Schwenk,et al.  Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond , 2018, Transactions of the Association for Computational Linguistics.

[33]  Rahul Goel,et al.  Online Embedding Compression for Text Classification using Low Rank Matrix Factorization , 2018, AAAI.

[34]  Yifan Peng,et al.  BioSentVec: creating sentence embeddings for biomedical texts , 2018, 2019 IEEE International Conference on Healthcare Informatics (ICHI).

[35]  Adam Lopez,et al.  Pre-training on high-resource speech recognition improves low-resource speech-to-text translation , 2018, NAACL.

[36]  Athena Vakali,et al.  A Unified Deep Learning Architecture for Abuse Detection , 2018, WebSci.

[37]  A. Barbaresi The Vast and the Focused: On the need for thematic web and blog corpora , 2019 .

[38]  Christian Biemann,et al.  GermEval 2019 Task 1: Hierarchical Classification of Blurbs , 2019, KONVENS.

[39]  Achim Rettberg,et al.  Logistic Regression and Naive Bayes for Hierarchical Multi-label Classification at GermEval 2019 - Task 1 , 2019, KONVENS.

[40]  Sandra Kübler,et al.  The HUIU Contribution to the GermEval 2019 Shared Task 1 , 2019, KONVENS.

[41]  David S. Batista,et al.  COMTRAVO-DS team at GermEval 2019 Task 1 on Hierarchical Classification of Blurbs , 2019, KONVENS.

[42]  Peter Klügl,et al.  Convolutional Neural Networks for Classification of German Blurbs , 2019, KONVENS.

[43]  Venkatesh Umaashankar,et al.  Multi-Label Multi-Class Hierarchical Classification using Convolutional Seq2Seq , 2019, KONVENS.

[44]  Dirk Labudde,et al.  Multi-Label Classification of Blurbs with SVM Classifier Chains , 2019, KONVENS.

[45]  K. RaghavanA.,et al.  Label Frequency Transformation for Multi-Label Multi-Class Text Classification , 2019, KONVENS.

[46]  Michael Wiegand,et al.  Overview of GermEval Task 2, 2019 Shared Task on the Identification of Offensive Language , 2019, KONVENS.

[47]  Indra Budi,et al.  Multi-label Hate Speech and Abusive Language Detection in Indonesian Twitter , 2019, Proceedings of the Third Workshop on Abusive Language Online.

[48]  Finn Årup Nielsen Danish in Wikidata lexemes , 2019, GWC.

[49]  Hans-Christian Schmitz,et al.  KoGra-R: Standardisierte statistische Auswertung von Korpusrecherchen , 2019 .

[50]  Ming-Wei Chang,et al.  BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.

[51]  Hugo Jair Escalante,et al.  Overview of MEX-A3T at IberLEF 2019: Authorship and Aggressiveness Analysis in Mexican Spanish Tweets , 2018, IberLEF@SEPLN.

[52]  Gaël Lejeune,et al.  A New Proposal for Evaluating Web Page Cleaning Tools , 2018, Computación y Sistemas.

[53]  Deepti Mehrotra,et al.  Comparative Analysis of Multi-label Classification Algorithms , 2018, 2018 First International Conference on Secure Cyber Computing and Communication (ICSCCC).

[54]  Lior Shamir,et al.  Quantitative Sentiment Analysis of Lyrics in Popular Music , 2018, Journal of Popular Music Studies.

[55]  Christian Biemann,et al.  Transfer Learning from LDA to BiLSTM-CNN for Offensive Language Detection in Twitter , 2018, ArXiv.

[56]  Benjamin Lecouteux,et al.  Improving the Coverage and the Generalization Ability of Neural Word Sense Disambiguation through Hypernymy and Hyponymy Relationships , 2018, ArXiv.

[57]  Ying Liu,et al.  A Comparison of 1-D and 2-D Deep Convolutional Neural Networks in ECG Classification , 2018, Conference proceedings : ... Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual Conference.

[58]  Kathleen McKeown,et al.  Predictive Embeddings for Hate Speech Detection on Twitter , 2018, ALW.

[59]  M. Klenner,et al.  Offensive language without offensive words (OLWOW) , 2018 .

[60]  Serena Villata,et al.  InriaFBK at Germeval 2018: Identifying Offensive Tweets Using Recurrent Neural Networks , 2018 .

[61]  Ona de Gibert,et al.  Hate Speech Dataset from a White Supremacy Forum , 2018, ALW.

[62]  Ritesh Kumar,et al.  Benchmarking Aggression Identification in Social Media , 2018, TRAC@COLING 2018.

[63]  Roland Vollgraf,et al.  Contextual String Embeddings for Sequence Labeling , 2018, COLING.

[64]  Sérgio Nunes,et al.  A Survey on Automatic Detection of Hate Speech in Text , 2018, ACM Comput. Surv..

[65]  Alexander Mehler,et al.  Resource-Size Matters: Improving Neural Named Entity Recognition with Optimized Large Corpora , 2018, 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA).

[66]  Aaron Klein,et al.  BOHB: Robust and Efficient Hyperparameter Optimization at Scale , 2018, ICML.

[67]  Alex Alves Freitas,et al.  A Survey of Genetic Algorithms for Multi-Label Classification , 2018, 2018 IEEE Congress on Evolutionary Computation (CEC).

[68]  Fabienne Baider,et al.  “Go to hell fucking faggots, may you die!” framing the LGBT subject in online comments , 2018, Lodz Papers in Pragmatics.

[69]  Martin Volk,et al.  Cutter - a Universal Multilingual Tokenizer , 2018, SwissText.

[70]  Michael Wiegand,et al.  Inducing a Lexicon of Abusive Words – a Feature-Based Approach , 2018, NAACL.

[71]  José Camacho-Collados,et al.  From Word to Sense Embeddings: A Survey on Vector Representations of Meaning , 2018, J. Artif. Intell. Res..

[72]  Benjamin Lecouteux,et al.  UFSAC: Unification of Sense Annotated Corpora and Tools , 2018, LREC.

[73]  Marc Kupietz,et al.  The German Reference Corpus DeReKo: New Developments - New Opportunities , 2018, LREC.

[74]  Christian Biemann,et al.  Retrofitting Word Representations for Unsupervised Sense Aware Word Similarities , 2018, LREC.

[75]  Serge Sharoff,et al.  Functional text dimensions for the annotation of web corpora , 2018 .

[76]  Guy De Pauw,et al.  Automatic Detection of Online Jihadist Hate Speech , 2018, ArXiv.

[77]  Isabelle Augenstein,et al.  Multi-Task Learning of Pairwise Sequence Classification Tasks over Disparate Label Spaces , 2018, NAACL.

[78]  Prakhar Gupta,et al.  Learning Word Vectors for 157 Languages , 2018, LREC.

[79]  Luke S. Zettlemoyer,et al.  Deep Contextualized Word Representations , 2018, NAACL.

[80]  Geoffrey E. Hinton,et al.  Matrix capsules with EM routing , 2018, ICLR.

[81]  Jimeng Sun,et al.  Explainable Prediction of Medical Codes from Clinical Text , 2018, NAACL.

[82]  Hongfang Liu,et al.  A Comparison of Word Embeddings for the Biomedical Natural Language Processing , 2018, J. Biomed. Informatics.

[83]  Mathieu Ravaut,et al.  Gradient descent revisited via an adaptive online learning rate , 2018 .

[84]  Sebastian Ruder,et al.  Universal Language Model Fine-tuning for Text Classification , 2018, ACL.

[85]  Tomas Mikolov,et al.  Advances in Pre-Training Distributed Word Representations , 2017, LREC.

[86]  Benjamin Heinzerling,et al.  BPEmb: Tokenization-free Pre-trained Subword Embeddings in 275 Languages , 2017, LREC.

[87]  Philip Bachman,et al.  Deep Reinforcement Learning that Matters , 2017, AAAI.

[88]  Matteo Pagliardini,et al.  Unsupervised Learning of Sentence Embeddings Using Compositional n-Gram Features , 2017, NAACL.

[89]  Hierarchical writing genre classification with neural networks , 2018 .

[90]  Giuseppe G. A. Celano,et al.  Aspect coding asymmetries of verbs: The case of Russian , 2018, KONVENS.

[91]  Michael Wiegand,et al.  Overview of the GermEval 2018 Shared Task on the Identification of Offensive Language , 2018 .

[92]  Mark Cieliebak,et al.  spMMMP at GermEval 2018 shared task : classification of offensive content in tweets using convolutional neural networks and gated recurrent units , 2018 .

[93]  Dominik Stammbach Offensive Language Detection with Neural Networks for Germeval Task 2018 , 2018 .

[94]  Ralf Krestel,et al.  Fine-Grained Classification of Offensive Language , 2018 .

[95]  Joaquín Padilla Montani,et al.  GermEval 2018 : German Abusive Tweet Detection , 2018 .

[96]  Fumiyo Fukumoto,et al.  HFT-CNN: Learning Hierarchical Category Structure for Multi-label Short Text Categorization , 2018, EMNLP.

[97]  Jörg Becker,et al.  Discussing the Value of Automatic Hate Speech Detection in Online Debates , 2018 .

[98]  Paolo Rosso,et al.  Overview of the Task on Automatic Misogyny Identification at IberEval 2018 , 2018, IberEval@SEPLN.

[99]  Josef Ruppenhofer,et al.  Guidelines for IGGSA Shared Task on the Identification of Offensive Language , 2018 .

[100]  Ping Fu,et al.  A Hierarchical Multi-Label Classification Algorithm for Gene Function Prediction , 2017 .

[101]  Tie-Yan Liu,et al.  LightGBM: A Highly Efficient Gradient Boosting Decision Tree , 2017, NIPS.

[102]  Sonja E. Bosch,et al.  A computational approach to Zulu verb morphology within the context of lexical semantics , 2017 .

[103]  Luca Antiga,et al.  Automatic differentiation in PyTorch , 2017 .

[104]  Lei Gao,et al.  Detecting Online Hate Speech Using Context Aware Models , 2017, RANLP.

[105]  Finn Årup Nielsen,et al.  Wembedder: Wikidata entity embedding web service , 2017, ArXiv.

[106]  Reza Javidan,et al.  Spam filtering in SMS using recurrent neural networks , 2017, 2017 Artificial Intelligence and Signal Processing Conference (AISP).

[107]  Donald E. Brown,et al.  HDLTex: Hierarchical Deep Learning for Text Classification , 2017, 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA).

[108]  Yiming Yang,et al.  Deep Learning for Extreme Multi-label Text Classification , 2017, SIGIR.

[109]  Anna Korhonen,et al.  Initializing neural networks for hierarchical multi-label text classification , 2017, BioNLP.

[110]  Lothar Lemnitzer,et al.  Die Korpusplattform des „Digitalen Wörterbuchs der deutschen Sprache“ (DWDS) , 2017 .

[111]  Nikos Pelekis,et al.  DataStories at SemEval-2017 Task 4: Deep LSTM with Attention for Message-level and Topic-based Sentiment Analysis , 2017, *SEMEVAL.

[112]  Stefano Faralli,et al.  Unsupervised, Knowledge-Free, and Interpretable Word Sense Disambiguation , 2017, EMNLP.

[113]  Sampo Pyysalo,et al.  Cancer Hallmarks Analytics Tool (CHAT): a text mining approach to organize and evaluate scientific literature on cancer , 2017, Bioinform..

[114]  Fabrício Benevenuto,et al.  A Measurement Study of Hate Speech in Social Media , 2017, HT.

[115]  Cody Buntain,et al.  A Large Labeled Corpus for Online Harassment Research , 2017, WebSci.

[116]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[117]  Pascale Fung,et al.  One-step and Two-step Classification for Abusive Language Detection on Twitter , 2017, ALW@ACL.

[118]  Ingmar Weber,et al.  Understanding Abuse: A Typology of Abusive Language Detection Subtasks , 2017, ALW@ACL.

[119]  Simon Clematide,et al.  Verb-Mediated Composition of Attitude Relations Comprising Reader and Writer Perspective , 2017, CICLing.

[120]  Robyn Speer,et al.  ConceptNet at SemEval-2017 Task 2: Extending Word Embeddings with Multilingual Relational Knowledge , 2017, *SEMEVAL.

[121]  Iryna Gurevych,et al.  EELECTION at SemEval-2017 Task 10: Ensemble of nEural Learners for kEyphrase ClassificaTION , 2017, *SEMEVAL.

[122]  Rodrigo C. Barros,et al.  Hierarchical multi-label classification with chained neural networks , 2017, SAC.

[123]  Vasudeva Varma,et al.  Deep Learning for Hate Speech Detection in Tweets , 2017, WWW.

[124]  Roberto Navigli,et al.  Word Sense Disambiguation: A Unified Evaluation Framework and Empirical Comparison , 2017, EACL.

[125]  Michael Wiegand,et al.  A Survey on Hate Speech Detection using Natural Language Processing , 2017, SocialNLP@EACL.

[126]  Sanja Fidler,et al.  Towards Diverse and Natural Image Descriptions via a Conditional GAN , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[127]  Ingmar Weber,et al.  Automated Hate Speech Detection and the Problem of Offensive Language , 2017, ICWSM.

[128]  Anna Korhonen,et al.  Text mining for improved exposure assessment , 2017, PloS one.

[129]  John G. Breslin,et al.  Knowledge Adaptation: Teaching to Adapt , 2017, ArXiv.

[130]  Yann Dauphin,et al.  Language Modeling with Gated Convolutional Networks , 2016, ICML.

[131]  Jason Lee,et al.  Fully Character-Level Neural Machine Translation without Explicit Segmentation , 2016, TACL.

[132]  Hinrich Schütze,et al.  Nonsymbolic Text Representation , 2016, EACL.

[133]  Björn Ross,et al.  Measuring the Reliability of Hate Speech Annotations: The Case of the European Refugee Crisis , 2016, ArXiv.

[134]  Kilian Q. Weinberger,et al.  Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[135]  Tomas Mikolov,et al.  Enriching Word Vectors with Subword Information , 2016, TACL.

[136]  Tomas Mikolov,et al.  Bag of Tricks for Efficient Text Classification , 2016, EACL.

[137]  Lars Kai Hansen,et al.  Open semantic analysis: The case of word level semantics in Danish , 2017 .

[138]  Fernando Benites de Azevedo e Souza,et al.  Multi-label Classification with Multiple Class Ontologies , 2017 .

[139]  Felice Dell'Orletta,et al.  Hate Me, Hate Me Not: Hate Speech Detection on Facebook , 2017, ITASEC.

[140]  Zeerak Waseem,et al.  Are You a Racist or Am I Seeing Things? Annotator Influence on Hate Speech Detection on Twitter , 2016, NLP+CSS@EMNLP.

[141]  Gerhard Backfried,et al.  Sentiment Analysis of Media in German on the Refugee Crisis in Europe , 2016, ISCRAM-med.

[142]  Chia-Hui Chang,et al.  Boosted Web Named Entity Recognition via Tri-Training , 2016, ACM Trans. Asian Low Resour. Lang. Inf. Process..

[143]  André Carlos Ponce de Leon Ferreira de Carvalho,et al.  Reduction strategies for hierarchical multi-label classification in protein function prediction , 2016, BMC Bioinformatics.

[144]  Adrien Barbaresi Efficient construction of metadata-enhanced web corpora , 2016, WAC@ACL.

[145]  Christian Biemann,et al.  Making Sense of Word Embeddings , 2016, Rep4NLP@ACL.

[146]  Sampo Pyysalo,et al.  How to Train good Word Embeddings for Biomedical NLP , 2016, BioNLP@ACL.

[147]  Stefan Evert,et al.  EmpiriST 2015: A Shared Task on the Automatic Linguistic Annotation of Computer-Mediated Communication and Web Corpora , 2016, WAC@ACL.

[148]  Ido Dagan,et al.  context2vec: Learning Generic Context Embedding with Bidirectional LSTM , 2016, CoNLL.

[149]  Diyi Yang,et al.  Hierarchical Attention Networks for Document Classification , 2016, NAACL.

[150]  ChengXiang Zhai,et al.  DeepMeSH: deep semantic representation for improving large-scale MeSH indexing , 2016, Bioinform..

[151]  Preslav Nakov,et al.  SemEval-2016 Task 4: Sentiment Analysis in Twitter , 2016, *SEMEVAL.

[152]  Dirk Hovy,et al.  Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter , 2016, NAACL.

[153]  Mikael Kågebäck,et al.  Word Sense Disambiguation using a Bidirectional LSTM , 2016, CogALex@COLING.

[154]  Yuan Yu,et al.  TensorFlow: A system for large-scale machine learning , 2016, OSDI.

[155]  Leonie Rösner,et al.  Dangerous minds? Effects of uncivil online comments on aggressive cognitions, emotions, and behavior , 2016, Comput. Hum. Behav..

[156]  Ulf Leser,et al.  SCARE ― The Sentiment Corpus of App Reviews with Fine-grained Annotations in German , 2016, LREC.

[157]  Joachim Bingel,et al.  KorAP Architecture ― Diving in the Deep Sea of Corpus Data , 2016, LREC.

[158]  Heiko Motschenbacher A corpus linguistic study of the situatedness of English pop song lyrics , 2016 .

[159]  Joel R. Tetreault,et al.  Abusive Language Detection in Online User Content , 2016, WWW.

[160]  Fabrício Benevenuto,et al.  Analyzing the Targets of Hate in Online Social Media , 2016, ICWSM.

[161]  Irene Zempi,et al.  The affinity between online and offline anti-muslim hate crime: dynamics and impacts , 2016 .

[162]  Ryan Doherty,et al.  Semi-supervised Word Sense Disambiguation with Neural Models , 2016, COLING.

[163]  M. Williams,et al.  Cyber-hate on social media in the aftermath of Woolwich , 2015 .

[164]  Anna Korhonen,et al.  Automatic semantic classification of scientific literature according to the hallmarks of cancer , 2016, Bioinform..

[165]  Passent El Kafrawy,et al.  Experimental Comparison of Methods for Multi-label Classification in different Application Domains , 2015 .

[166]  Anton Osokin,et al.  Breaking Sticks and Ambiguities with Adaptive Skip-gram , 2015, AISTATS.

[167]  Sabine Krome,et al.  Fremdwörter zwischen Isolation und Integration. Empirische Analysen zum Schreibusus auf der Basis von Textkorpora professioneller und informeller Schreiber , 2016 .

[168]  Michael Wiegand,et al.  Overview of the IGGSA 2016 Shared Task on Source and Target Extraction from Political Speeches , 2016 .

[169]  et al.,et al.  Jupyter Notebooks - a publishing format for reproducible computational workflows , 2016, ELPUB.

[170]  Guigang Zhang,et al.  Deep Learning , 2016, Int. J. Semantic Comput..

[171]  Zhong Jin,et al.  Document Sentiment Classification based on the Word Embedding , 2015, ICM 2015.

[172]  S. Malinen,et al.  “Leave Your Comment Below”: Can Biased Online Comments Influence Our Own Prejudicial Attitudes and Behaviors? , 2015 .

[173]  Travis E. Oliphant,et al.  Guide to NumPy , 2015 .

[174]  Francisco Charte,et al.  Addressing imbalance in multilabel classification: Measures and random resampling algorithms , 2015, Neurocomputing.

[175]  Walter Daelemans,et al.  Detection and Fine-Grained Classification of Cyberbullying Events , 2015, RANLP.

[176]  Mark Johnson,et al.  An Improved Non-monotonic Transition System for Dependency Parsing , 2015, EMNLP.

[177]  Hinrich Schütze,et al.  AutoExtend: Extending Word Embeddings to Embeddings for Synsets and Lexemes , 2015, ACL.

[178]  Sanja Fidler,et al.  Aligning Books and Movies: Towards Story-Like Visual Explanations by Watching Movies and Reading Books , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[179]  Matthew Leighton Williams,et al.  Cyber Hate Speech on Twitter: An Application of Machine Classification and Statistical Modeling for Policy and Decision Making , 2015 .

[180]  Roberto Navigli,et al.  SemEval-2015 Task 13: Multilingual All-Words Sense Disambiguation and Entity Linking , 2015, *SEMEVAL.

[181]  Jing Zhou,et al.  Hate Speech Detection with Comment Embeddings , 2015, WWW.

[182]  Geoffrey Zweig,et al.  Language Models for Image Captioning: The Quirks and What Works , 2015, ACL.

[183]  Eckhard Bick,et al.  CG-3 - Beyond Classical Constraint Grammar , 2015, NODALIDA.

[184]  Jannis Androutsopoulos,et al.  Networked multilingualism: Some language practices on Facebook and their implications , 2015 .

[185]  Xinlei Chen,et al.  Microsoft COCO Captions: Data Collection and Evaluation Server , 2015, ArXiv.

[186]  Georgios Paliouras,et al.  LSHTC: A Benchmark for Large-Scale Text Classification , 2015, ArXiv.

[187]  W. Alkema,et al.  Application of text mining in the biomedical domain. , 2015, Methods.

[188]  Nina Springer,et al.  User comments: motives and inhibitors to write and read , 2015 .

[189]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[190]  Jonathan Tompson,et al.  Efficient object localization using Convolutional Networks , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[191]  R. Kreyer “Funky fresh dressed to impress”: A corpus-linguistic view on gender roles in pop songs , 2015 .

[192]  Hwee Tou Ng,et al.  Semi-Supervised Word Sense Disambiguation Using Word Embeddings in General and Specific Domains , 2015, NAACL.

[193]  Ted Underwood,et al.  Understanding Genre in a Collection of a Million Volumes, Interim Report , 2014 .

[194]  Jeffrey Pennington,et al.  GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[195]  Michael Xavier Collins Information Density and Dependency Length as Complementary Cognitive Models , 2014, Journal of psycholinguistic research.

[196]  Andrew McCallum,et al.  Efficient Non-parametric Estimation of Multiple Embeddings per Word in Vector Space , 2014, EMNLP.

[197]  Markus Krötzsch,et al.  Wikidata , 2014, Commun. ACM.

[198]  Yoon Kim,et al.  Convolutional Neural Networks for Sentence Classification , 2014, EMNLP.

[199]  Stephen A. Rains,et al.  Online and Uncivil? Patterns and Determinants of Incivility in Newspaper Website Comments , 2014 .

[200]  Yoshua Bengio,et al.  Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.

[201]  Quoc V. Le,et al.  Distributed Representations of Sentences and Documents , 2014, ICML.

[202]  Phil Blunsom,et al.  A Convolutional Neural Network for Modelling Sentences , 2014, ACL.

[203]  André Carlos Ponce de Leon Ferreira de Carvalho,et al.  Hierarchical multi-label classification using local neural networks , 2014, J. Comput. Syst. Sci..

[204]  Julia Maria Struß,et al.  IGGSA Shared Tasks on German Sentiment Analysis (GESTALT) , 2014 .

[205]  Frank Hakemulder,et al.  Exploring absorbing reading experiences: Developing and validating a self-report scale to measure story world absorption , 2014 .

[206]  Nitish Srivastava,et al.  Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..

[207]  Josef Ruppenhofer,et al.  IGGSA-STEPS: Shared Task on Source and Target Extraction from Political Speeches , 2014, J. Lang. Technol. Comput. Linguistics.

[208]  Francisco Herrera,et al.  Self-labeled techniques for semi-supervised learning: taxonomy, software and empirical study , 2015, Knowledge and Information Systems.

[209]  M. Morini Towards a musical stylistics: Movement in Kate Bush’s ‘Running Up That Hill’ , 2013 .

[210]  Johan Bos,et al.  Elephant: Sequence Labeling for Word and Sentence Segmentation , 2013, EMNLP.

[211]  Jeffrey Dean,et al.  Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[212]  Yuzhou Wang,et al.  Locate the Hate: Detecting Tweets against Blacks , 2013, AAAI.

[213]  Yann LeCun,et al.  Regularization of Neural Networks using DropConnect , 2013, ICML.

[214]  Yichuan Tang,et al.  Deep Learning using Linear Support Vector Machines , 2013, 1306.0239.

[215]  Brendan T. O'Connor,et al.  Improved Part-of-Speech Tagging for Online Conversational Text with Word Clusters , 2013, NAACL.

[216]  Jeffrey Dean,et al.  Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.

[217]  Roger Levy,et al.  Memory and surprisal in human sentence comprehension , 2013 .

[218]  Margaret Mitchell,et al.  Overview of the TAC2013 Knowledge Base Population Evaluation: English Sentiment Slot Filling , 2013, TAC.

[219]  Ramesh Nallapati,et al.  Multi-instance Multi-label Learning for Relation Extraction , 2012, EMNLP.

[220]  Christopher D. Manning,et al.  Baselines and Bigrams: Simple, Good Sentiment and Topic Classification , 2012, ACL.

[221]  Simon Clematide,et al.  MLSA - A Multi-layered Reference Corpus for German Sentiment Analysis , 2012, LREC.

[222]  Rainer Perkuhn,et al.  Korpuslinguistik , 2012 .

[223]  Thomas Eckart,et al.  Building Large Monolingual Dictionaries at the Leipzig Corpora Collection: From 100 to 200 Languages , 2012, LREC.

[224]  André Carlos Ponce de Leon Ferreira de Carvalho,et al.  A genetic algorithm for Hierarchical Multi-Label Classification , 2012, SAC '12.

[225]  Mike Schuster,et al.  Japanese and Korean voice search , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[226]  Kathrin Beck,et al.  Stylebook for the Tübingen Treebank of Written German (TüBa-D/Z) , 2012 .

[227]  Dino Isa,et al.  An enhanced Support Vector Machine classification framework by using Euclidean distance function for text document categorization , 2011, Applied Intelligence.

[228]  Fernando Benites,et al.  An Empirical Comparison of Flat and Hierarchical Performance Measures for Multi-Label Classification with Hierarchy Extraction , 2011, KES.

[229]  Gaël Varoquaux,et al.  Scikit-learn: Machine Learning in Python , 2011, J. Mach. Learn. Res..

[230]  Steven T Piantadosi,et al.  Word lengths are optimized for efficient communication , 2011, Proceedings of the National Academy of Sciences.

[231]  Sonja E. Bosch,et al.  Towards Zulu corpus clean-up, lexicon development and corpus annotation by means of computational morphological analysis , 2011 .

[232]  Geoff Holmes,et al.  Classifier chains for multi-label classification , 2009, Machine Learning.

[233]  Qiang Yang,et al.  A Survey on Transfer Learning , 2010, IEEE Transactions on Knowledge and Data Engineering.

[234]  Sven Behnke,et al.  Evaluation of Pooling Operations in Convolutional Architectures for Object Recognition , 2010, ICANN.

[235]  Christine M. Pittman,et al.  Word-formation by phase in Inuit , 2010 .

[236]  T. Florian Jaeger,et al.  Redundancy and reduction: Speakers manage syntactic information density , 2010, Cognitive Psychology.
