conTEXT - Lightweight Text Analytics Using Linked Data

The Web democratized publishing: everybody can easily publish information on a website, a blog, in social networks or in microblogging systems. As the amount of published information grows, technologies for accessing, analysing, summarising and visualising it become ever more important. While substantial progress has been made in each of these areas individually in recent years, we argue that only an intelligent combination of approaches will make this progress truly useful and leverage further synergies between techniques. In this paper we develop a text analytics architecture of participation, which allows ordinary people to use sophisticated NLP techniques for analysing and visualising their content, be it a blog, Twitter feed, website or article collection. The architecture comprises interfaces for information access, natural language processing and visualisation. Different exchangeable components can be plugged into this architecture, making it easy to tailor to individual needs. We evaluate the usefulness of our approach by comparing both the effectiveness and efficiency of end users within a task-solving setting. Moreover, we evaluate the usability of our approach with a questionnaire. Both evaluations suggest that ordinary Web users are empowered to analyse their data and perform tasks that were previously out of reach.
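The idea of three interfaces with exchangeable components can be illustrated with a minimal Python sketch. It is not the conTEXT implementation: every name in it (`Source`, `Annotator`, `View`, `run_pipeline` and the toy components) is a hypothetical chosen for illustration, and the capitalised-word annotator is only a stand-in for a real NLP service.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, List, Protocol


@dataclass
class Annotation:
    text: str   # surface form of the mention
    start: int  # character offset of the mention in its document


# The three interfaces named in the abstract (names are assumptions).
class Source(Protocol):
    def fetch(self) -> Iterable[str]: ...


class Annotator(Protocol):
    def annotate(self, document: str) -> List[Annotation]: ...


class View(Protocol):
    def render(self, annotations: List[Annotation]) -> str: ...


class ListSource:
    """Information access: serves documents from an in-memory list."""

    def __init__(self, documents: Iterable[str]) -> None:
        self.documents = list(documents)

    def fetch(self) -> Iterable[str]:
        return self.documents


class CapitalisedWordAnnotator:
    """Toy stand-in for an NER service: marks capitalised tokens."""

    def annotate(self, document: str) -> List[Annotation]:
        annotations = []
        offset = 0
        for token in document.split():
            start = document.index(token, offset)
            offset = start + len(token)
            word = token.rstrip(".,;:!?")  # drop trailing punctuation only
            if word[:1].isupper():
                annotations.append(Annotation(word, start))
        return annotations


class FrequencyView:
    """Visualisation: a plain-text frequency list of annotated mentions."""

    def render(self, annotations: List[Annotation]) -> str:
        counts = Counter(a.text for a in annotations)
        return "\n".join(f"{text}: {n}" for text, n in counts.most_common())


def run_pipeline(source: Source, annotator: Annotator, view: View) -> str:
    """Wire one component per interface into a single analysis run."""
    annotations: List[Annotation] = []
    for document in source.fetch():
        annotations.extend(annotator.annotate(document))
    return view.render(annotations)


if __name__ == "__main__":
    docs = ["Leipzig hosts a university.", "We visited Leipzig and Dresden."]
    print(run_pipeline(ListSource(docs), CapitalisedWordAnnotator(), FrequencyView()))
```

Because `run_pipeline` depends only on the three protocols, any one component could be swapped (for example, a feed reader for `ListSource`, or a chart renderer for `FrequencyView`) without touching the others.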
