Towards Interoperability of European Language Resources

A core component of the European Union is a common market with a single information space that works with around two dozen national languages and many regional languages. This wide variety of languages presents linguistic barriers that can severely limit the free flow of goods, information and services throughout Europe.

In this article, we provide an overview of the META-NET Network of Excellence [1]. This ambitious initiative, consisting of 44 centres from 31 countries in Europe, aims to increase significantly the number of language technologies that can assist European citizens, by supporting enhanced communication and co-operation across languages. A major outcome of the project will be META-SHARE, a searchable network of repositories that collect resources such as language data, tools and related Web services, covering a large number of European languages. The resources within these repositories are intended to facilitate the development and evaluation of a wide range of new language processing applications and services.

Various new applications can be built by combining existing resources in different ways. This process can be helped greatly by ensuring that individual resources are interoperable, i.e., that they can be combined with little or no configuration. The UIMA (Unstructured Information Management Architecture) framework [2][3] is concerned specifically with ensuring the interoperability of resources, and the U-Compare platform [4][5][6], built on top of UIMA, is designed especially to facilitate the rapid construction and evaluation of natural language processing/text-mining applications using interoperable resources, without the need for any additional programming. U-Compare comes with a library of resources of several different types. As part of META-NET, this library will be extended to cover a number of different European languages.
The functionality of U-Compare will also be enhanced to allow the use of multilingual and cross-lingual components, such as those that carry out machine translation. By integrating and showcasing the functionality of U-Compare within META-SHARE, the aim is to demonstrate that META-SHARE can serve not only as a useful tool for locating language resources for a range of languages, but also as an integrated environment that allows rapid prototyping and testing of applications that make use of these resources.