The Modern Greek Language on the Social Web: A Survey of Data Sets and Mining Applications

Mining social web text has been central to Natural Language Processing and Data Mining research for the last 15 years. Although most reported work targets widely spoken languages such as English, approaches that address less commonly spoken languages such as Greek matter for preserving and documenting minority languages and cultural and ethnic diversity, and for identifying intercultural similarities and differences. The present work aims to identify, document, and compare social text data sets, as well as mining techniques and applications on social web text, that target Modern Greek, focusing on the challenges that arise and the potential for future research on this less widely spoken language.
