Stance Detection in Hindi-English Code-Mixed Data
Social media sites such as Twitter, Facebook, and many other microblogging forums have emerged as platforms where people express their opinions and perspectives on different events. On these platforms, people often take a stance on a particular topic: in favor, against, or neutral. Hindi and English are the most widely used languages on social media platforms in India, and users predominantly express their opinions in Hindi-English code-mixed text. As a result, it is difficult to gauge the diverse opinions of the masses. We aim to classify Hindi-English code-mixed tweets by their stance. The experiments so far use a dataset of 3545 English-Hindi code-mixed tweets with demonetisation as the target. We present a new stance-annotated dataset of 4219 English-Hindi code-mixed tweets with the abrogation of Article 370 as the target.