Paper information - Interoperability = f(community, division of labour)

This paper motivates the hypothesis that practical interoperability can be seen as a function of whether and how stakeholder communities duplicate or divide work in a given area or market. We focus on language processing, an area that has traditionally produced many diverse tools that are not immediately interoperable. At the same time, there is a strong desire to combine these tools into processing pipelines and to apply those pipelines to a wide range of corpora. The gap between generic, inherently "empty" interoperability frameworks, which offer no NLP capabilities themselves, and dedicated NLP tools has given rise to a new class of NLP-related projects that focus specifically on interoperability: component collections. These projects drive interoperability in a very pragmatic way that could well prove more successful than, for example, past efforts towards standardised formats, which ultimately saw little adoption or support by software tools.

Richard Eckart de Castilho
