An Attempt To Computerized Dictionary Data Bases

Two dictionary data base systems developed at Kyoto University are presented in this paper. One is a system for a Japanese dictionary (Shinmeikai Kokugojiten, published by Sansei-do) and the other is for an English-Japanese dictionary (New Concise English-Japanese Dictionary, also published by Sansei-do). Both are medium-sized dictionaries which contain about 60,000 lexical items. The discussion is divided into two sub-topics. The first is the problem of data translation for large, unformatted linguistic data. Up to now, no serious attempts have been made to address this problem, though several systems have been proposed for translating data from one fixed format into another. A universal data translator/verifier, called DTV, has been developed and used for the data translation of the two dictionaries. The detailed construction of DTV is given. The second sub-topic is the problem of data organization appropriate for dictionaries. It is emphasized that the distinction between 'external structures' and 'internal structures' is important in a dictionary system. Though the external structures can be easily managed by general DBMS's, the internal (or linguistic) structures cannot be manipulated well by them. Some additional, linguistically oriented operations should be incorporated into dictionary data base systems alongside the universal DBMS operations. Some examples of applications of the dictionary systems are also given.