Uygurcadan Türkçeye Bilgisayarlı Çeviri (Machine Translation from Uyghur to Turkish)
Machine translation is a subfield of natural language processing, which in turn belongs to artificial intelligence. In general, it uses software to translate text from one natural language into another. In the 1950s, the Georgetown experiment involved fully automatic translation of over sixty Russian sentences into English (Hutchins, 2004). The experiment was a great success and ushered in an era of substantial funding for machine translation research. One of the main projects initiated by the US at that time was a machine translation system that converted Russian to English. This project continued from 1950 to 1960. In 1964, government sponsors of machine translation in the United States formed the Automatic Language Processing Advisory Committee (ALPAC) to examine the project's potential. In its famous 1966 report, ALPAC concluded that machine translation was slower, less accurate and twice as expensive as human translation, and that "there is no immediate or predictable prospect of useful machine translation" (Hutchins, 1995). The report brought machine translation research in the US to a virtual halt for over a decade after its publication. As computer technology developed, high-capacity and high-speed computers were produced. The main restrictions on studying natural language were thus removed, and machine translation regained the attention of the computer science community. Despite technological advances and the advent of new methods, a general-purpose, fully automatic machine translation system still does not exist. To date, few machine translation systems have been developed; moreover, they can be applied only to restricted texts, and post-editing is usually necessary after the initial translation. The main reasons for this are the morphological, syntactic and lexical differences between languages. As a result, machine-translated texts remain inferior to high-quality translations.
Recently, some machine translation systems designed for related languages, such as Czech to Slovak, Spanish to Catalan, and Turkmen to Turkish, have been implemented; studies of these systems have shown that successful translations can be produced efficiently. In this study, our aim was to implement a machine translation system between Uyghur and Turkish. Uyghur is an agglutinative language, like the other Turkic languages (e.g. Turkmen, Kazakh, Kyrgyz, Uzbek and Azeri). All Turkic languages belong to the Ural-Altaic language family and are characteristically agglutinative, with productive inflectional and derivational morphology. Most research on natural language processing and machine translation of Turkic languages focuses on Turkish, mainly because there is active ongoing research on the subject in Turkey, which continues to produce valuable results. To date, few machine translation systems between Turkic languages have been implemented; examples include Turkish to Azeri, Turkish to Crimean Tatar, and Turkmen to Turkish. Unfortunately, little computational research on Uyghur exists. Turkic languages tend to have similar morphological structure and share some common word roots. Their main shared properties include similar word order and syntactic structure. However, there are differences that prevent mutual intelligibility between these languages. To implement this translation system, we used a framework that is favored for translation between closely related agglutinative languages. Accordingly, we implemented a morphological analyzer for Uyghur with Xerox's finite-state transducer (FST) tools. In this morphological analyzer we covered the general cases of Uyghur and tagged Uyghur words with the same tags that are used for tagging words in other Turkic languages. This makes it easy to integrate the system with other Turkic languages.
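To illustrate why shared tags simplify transfer between closely related agglutinative languages, the following is a minimal Python sketch, not the system described here. It does not use Xerox's FST tools. The tiny lexicon, the suffix table, the tag names (`+A3pl`, `+Loc`), and all function names are assumptions introduced purely for illustration. The sketch analyzes a word into a root plus tags, maps the root to the target language, and regenerates the surface form from the same tags.

```python
# Toy illustration only: lexicon, suffix table, and tag names are assumptions,
# not the resources of the system described in the abstract.

# Hypothetical source-root -> Turkish-root lexicon (one entry for the demo).
ROOTS = {"kitab": "kitap"}

# Hypothetical source-side suffixes and the shared tags they map to.
SUFFIXES = [("lar", "+A3pl"), ("ler", "+A3pl"), ("da", "+Loc"), ("de", "+Loc")]

# Turkish surface forms per tag: (after a back vowel, after a front vowel).
# Simplification: consonant assimilation (e.g. -da vs. -ta) is ignored.
TURKISH = {"+A3pl": ("lar", "ler"), "+Loc": ("da", "de")}


def analyze(word):
    """Strip known suffixes from the right; return (root, tags) or None."""
    stem, tags = word, []
    changed = True
    while stem not in ROOTS and changed:
        changed = False
        for surface, tag in SUFFIXES:
            if stem.endswith(surface) and len(stem) > len(surface):
                stem = stem[: -len(surface)]
                tags.insert(0, tag)  # keep tags in left-to-right order
                changed = True
                break
    return (stem, tags) if stem in ROOTS else None


def last_vowel_is_back(text):
    """Two-way vowel harmony check on the last vowel of the stem."""
    for ch in reversed(text):
        if ch in "aıou":
            return True
        if ch in "eiöü":
            return False
    return True


def generate(root, tags):
    """Realize each tag as a Turkish suffix that harmonizes with the stem."""
    out = root
    for tag in tags:
        back, front = TURKISH[tag]
        out += back if last_vowel_is_back(out) else front
    return out


def translate(word):
    """Analyze, transfer the root, and regenerate; None if analysis fails."""
    analysis = analyze(word)
    if analysis is None:
        return None
    root, tags = analysis
    return generate(ROOTS[root], tags)


if __name__ == "__main__":
    print(analyze("kitablarda"))
    print(translate("kitablarda"))
```

Because both sides use one tag set in this sketch, the transfer step only has to map roots; the tag sequence passes through unchanged. A real analyzer would also have to handle ambiguity, which this toy version ignores.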
To improve the system's performance, we implemented a rule-based morphological disambiguator and a word-sense disambiguator. We evaluated the system's performance using BLEU scores on 240 differently structured sentences. The resulting system can successfully translate intermediate-level Uyghur text into Turkish.
Keywords: Machine translation, Turkic languages, Uyghur language, Turkish.