ROTEL: LINGUISTIC WEB SERVICES

According to Tim Berners-Lee, a web service provides for “program integration across application and organizational boundaries” [2]. The integration is achieved via a standardized RPC interface implemented with SOAP, a message format built on XML. The great advantage of the web service concept is that the machine hosting the service need not be the same as the machine of the user of the service. The concept is borrowed from RPC and CORBA and extended so that messages can be transported over any Internet protocol, such as HTTP or SMTP. SOAP, whose latest version is 1.2, is one of the protocols available for encoding the messages, and a web service can be formally described in an XML language called WSDL. Together, SOAP and WSDL make a web service directly usable from any application, provided that the user knows the URL of the WSDL file describing it. Thus, the time otherwise spent collecting, adapting and testing a standalone application is reduced to a minimum.

The present article describes several linguistic web services for English and Romanian developed during the CEEX ROTEL project. They implement NLP operations such as POS tagging (with its prerequisites, sentence and token splitting), lemmatization, chunking, word linking, WordNet lookup, language identification and diacritics insertion (for Romanian).
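To make the SOAP-over-HTTP exchange concrete, the following minimal sketch builds a SOAP 1.2 request envelope and wraps it in an HTTP POST using only the Python standard library. The operation name Tokenize, its parameters text and lang, the service namespace and the endpoint URL are hypothetical placeholders chosen for illustration; they are not taken from the ROTEL services, whose actual operations are defined in their WSDL files. Only the SOAP 1.2 envelope namespace and the application/soap+xml media type come from the SOAP 1.2 standard.

```python
import urllib.request
import xml.etree.ElementTree as ET

# SOAP 1.2 envelope namespace, as defined by the W3C Recommendation.
SOAP12_NS = "http://www.w3.org/2003/05/soap-envelope"
# Hypothetical service namespace (assumption; a real one comes from the WSDL).
SERVICE_NS = "http://example.org/rotel"


def build_envelope(operation: str, params: dict) -> bytes:
    """Serialize a SOAP 1.2 request envelope for a single operation call."""
    envelope = ET.Element(f"{{{SOAP12_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP12_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = value
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


def build_request(endpoint: str, payload: bytes) -> urllib.request.Request:
    """Wrap the envelope in an HTTP POST; nothing is sent until urlopen() is called."""
    return urllib.request.Request(
        endpoint,
        data=payload,
        headers={"Content-Type": "application/soap+xml; charset=utf-8"},
        method="POST",
    )


# Hypothetical call: operation name, parameters and endpoint are placeholders.
payload = build_envelope("Tokenize", {"text": "Ana are mere.", "lang": "ro"})
request = build_request("http://example.org/rotel/service", payload)
print(payload.decode("utf-8"))
```

Passing the request object to urllib.request.urlopen would send the message to a real endpoint. In practice a client would usually not build envelopes by hand but generate them from the WSDL description, which is what makes the service usable without service-specific client code.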