The Japanese Government Project for Machine Translation

The project is funded by a grant from the Agency of Science and Technology through the Special Coordination Funds for the Promotion of Science and Technology, and was started in fiscal 1982. The formal title of the project is "Research on Fast Information Services between Japanese and English for Scientific and Engineering Literature". The purpose is to demonstrate the feasibility of machine translation of abstracts of scientific and engineering papers between the two languages and, as a result, to establish a fast information exchange system for these papers. The project term was initially scheduled as three years from fiscal 1982, with a budget of about seven hundred million yen, but, due to the present financial pressures on the government, the term has been extended to four years, up to 1986.

The project is conducted through close cooperation among four organizations. At Kyoto University, we are responsible for developing the software system for the core part of the machine translation process (the grammar writing system and the execution system); the grammar systems for analysis, transfer, and synthesis; the detailed specification of what information is written in the word dictionaries (for all the parts of speech in the analysis, transfer, and generation dictionaries); and the working manuals for constructing these dictionaries. The Electrotechnical Laboratories (ETL) are responsible for the machine translation text input and output, morphological analysis and synthesis, and the construction of the verb and adjective dictionaries based on the working manuals prepared at Kyoto. The Japan Information Center for Science and Technology (JICST) is in charge of the noun dictionary and the compiling of special technical terms in scientific and technical fields. The Research Information Processing System (RIPS) under the Agency of Engineering Technology is responsible for completing the machine translation system, including the man-machine interfaces to the system developed at Kyoto, which allow pre- and post-editing, access to grammar rules, and dictionary maintenance.

The project is not primarily concerned with the development of a final practical system; that will be developed by private industry using the results of this project. Technical know-how is already being transferred gradually to private enterprise through the participation of people from industry in the project. Software and linguistic data are also being transferred in part. Finally, complete technical transfer will be done under the proper conditions.

The Japanese source texts being used are abstracts of scientific and technical papers published in the monthly JICST journal Current Bibliography of Science and Technology. At present, the project is only processing texts in the electronics, electrical engineering, and computer science fields. English source texts will be abstracts from INSPEC in these fields.

The sentence structures used in abstracts tend to be complex compared to ordinary sentences, with long nominal compounds, noun-phrase conjunctions, mathematical and physical formulas, long embedded sentences, and so on. The analysis and translation of this type of sentence structure is far more difficult than that of ordinary sentence patterns. However, we have not included a pre-editing stage because we wanted to find the ultimate limitations on handling this type of complex sentence structure.

Our system is based on the following concepts:

1. The use of all available linguistic information, both surface and syntactic. The writing of as detailed as possible syntactic rules. The development of a grammar writing system that can accept any future level of sophisticated linguistic theory.

2. The introduction of semantic information wherever necessary to enable the syntactic analysis to be as accurate as possible.
The importance of semantic information not over-estimated; a well-balanced usage of both syntax and semantics. Heavily seman-