Toward Interactive Visualization of Results from Domain-Specific Text Analytics

In big data analytics, visualization and access are central to creating knowledge and value from data. Interactive visualizations of analyses of structured data are commonplace; this paper addresses information visualization and interaction for text analysis. It motivates the issue from a data usage standpoint, surveys approaches to interactive visualization of text analytics, and proposes a specific solution design for visual interaction with results from a combination of named entity recognition (NER) and text categorization (TC). This matrix-based model provides abstract views of complex relationships between abstract entities and serves as an example for any combination of feature extraction and TC. The aim of our proposal is to support feature extraction and TC researchers in distributed virtual research environments by providing intuitive visual interfaces.
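To make the idea of a matrix-based view over combined NER and TC results more concrete, the following is a minimal sketch and not the design proposed in the paper. It assumes that each document already carries a list of entities (as an NER step might produce) and a list of categories (as a TC step might assign), and it counts how many documents pair each entity with each category. The input records, the field names `entities` and `categories`, and the function `build_entity_category_matrix` are all hypothetical and chosen for illustration only.

```python
from collections import Counter

# Hypothetical input: entities as found by an NER step and categories as
# assigned by a TC step. All values are made-up illustration data.
documents = [
    {"entities": ["aspirin", "headache"], "categories": ["pharmacology"]},
    {"entities": ["aspirin"], "categories": ["pharmacology", "cardiology"]},
    {"entities": ["stent"], "categories": ["cardiology"]},
]


def build_entity_category_matrix(docs):
    """Count, for each (entity, category) pair, the documents containing both."""
    counts = Counter()
    for doc in docs:
        # Use sets so a pair is counted at most once per document.
        for entity in set(doc["entities"]):
            for category in set(doc["categories"]):
                counts[(entity, category)] += 1

    # Sorted labels give a stable row and column order for display.
    entities = sorted({entity for entity, _ in counts})
    categories = sorted({category for _, category in counts})

    # Rows are entities, columns are categories; missing pairs count as 0.
    matrix = [[counts[(e, c)] for c in categories] for e in entities]
    return entities, categories, matrix


entities, categories, matrix = build_entity_category_matrix(documents)
```

A matrix of this shape could then be handed to any heat-map or table widget, with rows and columns acting as the interactive handles for filtering by entity or by category. Counting documents rather than mentions is one possible design choice; other weightings would fit the same structure.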
