Implementation of a Workflow Management System for Non-Expert Users

In the Danish CLARIN-DK infrastructure, chaining language technology (LT) tools into a workflow is easy even for a non-expert user, because she only needs to specify the input and the desired output of the workflow. From this information and the registered input and output profiles of the available tools, the CLARIN-DK workflow management system (WMS) computes the combinations of tools that will produce the desired result. This advanced functionality was not originally envisaged; it came within reach once the WMS was written partly in Java and partly in Bracmat, a programming language for symbolic computation. Handling LT tool profiles, including the computation of workflows, is easier with Bracmat’s language constructs for tree pattern matching and tree construction than with the constructs offered by mainstream programming languages.
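The idea of computing workflows from tool profiles can be illustrated with a minimal sketch. It is not the CLARIN-DK implementation, which relies on Bracmat's tree pattern matching; it is a plain breadth-first search written in Python for illustration only. The tool names, the feature names (`format`, `annotation`), the flat-dictionary profile layout, and the function `find_workflows` are all hypothetical assumptions, not taken from the actual system.

```python
from collections import deque

# Hypothetical tool registry (not the real CLARIN-DK profiles).
# Each tool maps to a pair: (features required of its input,
#                            features it sets on its output).
TOOLS = {
    "pdf_to_text": ({"format": "pdf"}, {"format": "plain"}),
    "tokenizer": ({"format": "plain", "annotation": "none"},
                  {"annotation": "tokens"}),
    "pos_tagger": ({"annotation": "tokens"}, {"annotation": "pos"}),
    "lemmatizer": ({"annotation": "pos"}, {"annotation": "lemmas"}),
    "direct_lemmatizer": ({"annotation": "tokens"}, {"annotation": "lemmas"}),
}


def matches(required, state):
    """True if every required feature has the required value in state."""
    return all(state.get(key) == value for key, value in required.items())


def find_workflows(start, goal, tools=TOOLS, max_len=5):
    """Return every tool chain (shortest first) that turns data described
    by `start` into data satisfying `goal`. Each tool is used at most once
    per chain, and chains are cut off at `max_len` tools."""
    results = []
    queue = deque([(dict(start), [])])
    while queue:
        state, chain = queue.popleft()
        if matches(goal, state):
            results.append(chain)
            continue  # do not extend a chain that already reaches the goal
        if len(chain) == max_len:
            continue
        for name, (required, produced) in tools.items():
            if name not in chain and matches(required, state):
                # Applying a tool overwrites the features it produces.
                queue.append(({**state, **produced}, chain + [name]))
    return results


if __name__ == "__main__":
    # The user states only what she has and what she wants.
    user_input = {"format": "pdf", "annotation": "none"}
    desired_output = {"annotation": "lemmas"}
    for workflow in find_workflows(user_input, desired_output):
        print(" -> ".join(workflow))
```

In this sketch the user supplies only the description of the input and of the desired output; the search proposes every chain of registered tools that connects the two. Real tool profiles are richer than flat dictionaries, which is where the abstract's argument for tree pattern matching applies.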
