High Performance Computing for Understanding Natural Language

The amount of user-generated text available online is growing at an ever-increasing rate, driven by rapid advances in inexpensive storage capacity, processing capabilities, and the popularity of online outlets and social networks. Learning language representations and solving tasks end to end, without hand-crafted features from human experts, has made models more accurate but also far larger in parameter count, requiring parallelized and distributed resources such as high-performance computing clusters or the cloud. This chapter gives an overview of state-of-the-art natural language processing problems, algorithms, models, and libraries. Parallelized and distributed approaches to text understanding, representation, and classification are also discussed. Finally, the importance of high-performance computing for natural language processing is illustrated through several applications that rely on pre-training, or self-supervised learning, on large amounts of text.
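To make the distributed-training point concrete, below is a minimal sketch of fine-tuning a pretrained language model for text classification across multiple GPUs with PyTorch's DistributedDataParallel. It is an illustration rather than the chapter's own method: the model, dataset, batch size, and learning rate are placeholders, and it assumes a launch via torchrun (one process per GPU) so that the process group can read its rank from the environment.

```python
# Minimal sketch (illustrative, not from the chapter): distributed fine-tuning
# of a pretrained classifier with PyTorch DistributedDataParallel.
# Assumes launch with `torchrun --nproc_per_node=<num_gpus> train.py`.
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler

def train(model, dataset, epochs=3, lr=2e-5):
    # torchrun sets the rank/world-size environment variables for us.
    dist.init_process_group("nccl")
    rank = dist.get_rank()
    torch.cuda.set_device(rank)

    model = model.cuda(rank)
    model = DDP(model, device_ids=[rank])  # gradients are all-reduced across workers

    sampler = DistributedSampler(dataset)  # each worker sees a disjoint data shard
    loader = DataLoader(dataset, batch_size=32, sampler=sampler)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()

    for epoch in range(epochs):
        sampler.set_epoch(epoch)  # reshuffle the shards each epoch
        for inputs, labels in loader:
            inputs, labels = inputs.cuda(rank), labels.cuda(rank)
            optimizer.zero_grad()
            loss = loss_fn(model(inputs), labels)
            loss.backward()  # backward pass triggers the gradient all-reduce
            optimizer.step()

    dist.destroy_process_group()
```

The same pattern scales from a single multi-GPU node to an HPC cluster: each process holds a full model replica, data is sharded by the sampler, and only gradients cross the interconnect.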
