Paper Information - Stance Detection in Hindi-English Code-Mixed Data

Stance Detection in Hindi-English Code-Mixed Data

Social media sites such as Twitter, Facebook, and many other microblogging forums have emerged as platforms for people to express their opinions and perspectives on different events. On these platforms, people often take a stance towards a particular topic: in favor, against, or neutral. Hindi and English are the most widely used languages on social media platforms in India, and users predominantly express their opinions in Hindi-English code-mixed text. As a result, it is difficult to gauge the diverse opinions of the masses. We aim to classify Hindi-English code-mixed tweets based on their stance. Experiments so far have used a dataset of 3545 English-Hindi code-mixed tweets with demonetisation as the target. We present a new stance-annotated dataset of 4219 English-Hindi code-mixed tweets focused on the abrogation of Article 370.

Vivek Srivastava | Jethva Utsav | Dhaiwat Kabaria | Ribhu Vajpeyi | Mohit Mina

[1] Suraj Tripathi, et al. Stance Detection in Code-Mixed Hindi-English Social Media Data using Multi-Task Learning, 2019, WASSA@NAACL-HLT.

[2] Vinay Singh, et al. An English-Hindi Code-Mixed Corpus: Stance Annotation and Baseline System, 2018, ArXiv.

[3] Manish Shrivastava, et al. Joining Hands: Exploiting Monolingual Treebanks for Parsing of Code-mixing Data, 2017, EACL.

[4] Riyaz Ahmad Bhat, et al. Universal Dependency Parsing for Hindi-English Code-Switching, 2018, NAACL.