A Comprehensive NLP System for Modern Standard Arabic and Modern Hebrew

This paper presents a comprehensive NLP system for Arabic recently developed by Melingo. It is based on Morfix™, an operational and highly successful comprehensive Hebrew NLP system developed earlier.

The system includes modules for morphological analysis, context-sensitive lemmatization, vocalization, text-to-phoneme conversion, and a prosody (intonation) model based on syntactic analysis. It is employed in applications such as full-text search, information retrieval, text categorization, textual data mining, online contextual dictionaries, filtering, and text-to-speech in the fields of telephony and accessibility, and it could also serve as a handy aid for non-fluent Arabic or Hebrew speakers.

Modern Hebrew and Modern Standard Arabic share some unique Semitic linguistic characteristics. Yet until now the two languages have been handled separately in Natural Language Processing circles, at both the academic and the applied levels. This paper reviews the major similarities and the minor dissimilarities between Modern Hebrew and Modern Standard Arabic from the NLP standpoint, and emphasizes the benefit of developing and maintaining a unified system for both languages.