Text Analytics: the convergence of Big Data and Artificial Intelligence

Text analytics is the analysis of the text content of emails, blogs, tweets, forums and other forms of textual communication. It is applicable to most industries: it can help analyze millions of emails, analyze customers' comments and questions in forums, and perform sentiment analysis by measuring positive or negative perceptions of a company, brand, or product. Text Analytics has also been called text mining, and is a subcategory of the Natural Language Processing (NLP) field, one of the founding branches of Artificial Intelligence, dating back to the 1950s, when interest in understanding text first developed. Text Analytics is now often considered the next step in Big Data analysis. It has a number of subdivisions: Information Extraction, Named Entity Recognition, Semantic Web annotated domain representation, and many more. Several techniques are currently in use, and some of them, such as Machine Learning, have gained a lot of attention for offering semi-supervised enhancement of systems; however, they also present a number of limitations, so they are not always the only or the best choice. We conclude with current and near-future applications of Text Analytics.
