Paper Information - Semi-Automatic Training Set Construction for Supervised Sentiment Analysis in Polarized Contexts

Standard sentiment analysis techniques usually rely either on sets of rules based on semantic and affective information or on machine learning approaches whose quality depends heavily on the size and significance of a training set of pre-labeled text samples. In many situations, this labeling must be performed by hand, which can limit the size of the training set. To address this issue, we propose a methodology to retrieve text samples from Twitter and label them automatically. We also tackle the situation in which the base rates of positive and negative sentiment samples in the training and test sets are biased with respect to the system to which the classifier is intended to be applied.
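The abstract does not say how the base-rate mismatch is handled, so the following is only an illustration of the general problem, not the authors' method. It is a minimal sketch of the standard prior-shift adjustment: a positive-class probability produced by a classifier trained at one base rate is rescaled to the base rate expected in the target system. The function name `adjust_posterior` and its parameters are hypothetical.

```python
def adjust_posterior(p_pos, train_rate, target_rate):
    """Rescale a positive-class probability from the training base rate
    to the base rate expected where the classifier will be applied.

    Assumption: only the class proportions differ between the two
    settings; the text of each class looks the same in both.
    """
    if not (0.0 < train_rate < 1.0 and 0.0 < target_rate < 1.0):
        raise ValueError("base rates must lie strictly between 0 and 1")
    if not (0.0 <= p_pos <= 1.0):
        raise ValueError("p_pos must be a probability")

    # Reweight each class by the ratio of target prior to training prior.
    pos = p_pos * target_rate / train_rate
    neg = (1.0 - p_pos) * (1.0 - target_rate) / (1.0 - train_rate)

    # Renormalize so the two class probabilities sum to one.
    return pos / (pos + neg)


if __name__ == "__main__":
    # A sample scored 0.5 by a classifier trained on balanced data,
    # applied where only 20% of samples are expected to be positive.
    print(adjust_posterior(0.5, train_rate=0.5, target_rate=0.2))
```

When the training and target base rates coincide, the probability is returned unchanged; the further apart they are, the more an uncertain prediction is pulled toward the majority class of the target system.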
