The Impact of Indirect Machine Translation on Sentiment Classification

Sentiment classification is crucial for many natural language processing (NLP) applications, such as the analysis of movie reviews, tweets, or customer feedback. Building a robust sentiment classification system requires a sufficiently large amount of data, but such resources are not available for every domain or language. In this work, we use a machine translation (MT) system to translate customer feedback into another language and investigate in which cases translated sentences have a positive or negative impact on an automatic sentiment classifier. Furthermore, because direct translation is not always possible, we also explore the performance of automatic classifiers on sentences translated through a pivot MT system. We conduct several experiments with both approaches to analyse the performance of our sentiment classification system and discuss the advantages and drawbacks of classifying translated sentences.
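
To make the pivot setting concrete, the following minimal sketch chains two translation steps (source to pivot, pivot to target) and classifies the result. It is an illustration only, not the system evaluated in this work: the word-lookup "translators", the French/Spanish/English language choice, the tiny lexicon classifier, and all function names are hypothetical stand-ins for real MT models and a trained classifier.

```python
from typing import Callable, Dict

Translator = Callable[[str], str]


def make_lookup_translator(table: Dict[str, str]) -> Translator:
    """Build a toy word-by-word 'MT system' (hypothetical stand-in for a real model)."""
    def translate(sentence: str) -> str:
        # Unknown tokens are copied through unchanged.
        return " ".join(table.get(tok, tok) for tok in sentence.split())
    return translate


def pivot_translate(sentence: str,
                    src_to_pivot: Translator,
                    pivot_to_tgt: Translator) -> str:
    """Indirect translation: source -> pivot language -> target language."""
    return pivot_to_tgt(src_to_pivot(sentence))


def classify_sentiment(sentence: str) -> str:
    """Toy lexicon-based classifier (assumed for illustration only)."""
    positive = {"good", "great"}
    negative = {"bad", "poor"}
    tokens = set(sentence.lower().split())
    score = len(tokens & positive) - len(tokens & negative)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical toy lexicons: French -> Spanish (pivot) -> English.
fr_to_es = make_lookup_translator(
    {"le": "el", "service": "servicio", "est": "es", "bon": "bueno"})
es_to_en = make_lookup_translator(
    {"el": "the", "servicio": "service", "es": "is", "bueno": "good"})

translated = pivot_translate("le service est bon", fr_to_es, es_to_en)
label = classify_sentiment(translated)
```

Because the two steps are composed, any error introduced in the source-to-pivot step is passed on to the pivot-to-target step, which is one reason the classifier's behaviour on pivot-translated sentences is worth examining separately from direct translation.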
