Developing and Orchestrating a Portfolio of Natural Legal Language Processing and Document Curation Services

We present a portfolio of natural legal language processing and document curation services currently under development in a collaborative European project. We first give an overview of the project and its use cases. The main part of the article focuses on the 13 processing services that are being deployed in different prototype applications using a flexible and scalable microservices architecture. Their orchestration is operationalised through a content and document curation workflow manager.
