Providing machine tractable dictionary tools

Machine readable dictionaries (MRDs) contain knowledge about language and the world essential for tasks in natural language processing (NLP). However, this knowledge, collected and recorded by lexicographers for human readers, is not presented in a manner for MRDs to be used directly for NLP tasks. What is badly needed are machine tractable dictionaries (MTDs): MRDs transformed into a format usable for NLP. This paper discusses three different but related large-scale computational methods to transform MRDs into MTDs. The MRD used is The Longman Dictionary of Contemporary English (LDOCE). The three methods differ in the amount of knowledge they start with and the kinds of knowledge they provide. All require some handcoding of initial information but are largely automatic. Method I, a statistical approach, uses the least handcoding. It generates "relatedness" networks for words in LDOCE and presents a method for doing partial word sense disambiguation. Method II employs the most handcoding because it develops and builds lexical entries for a very carefully controlled defining vocabulary of 2,000 word senses (1,000 words). The payoff is that the method will provide an MTD containing highly structured semantic information. Method III requires the handcoding of a grammar and the semantic patterns used by its parser, but not the handcoding of any lexical material. This is because the method builds up lexical material from sources wholly within LDOCE. The information extracted is a set of sources of information, individually weak, but which can be combined to give a strong and determinate linguistic data base.
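
To make the idea of a "relatedness" network in Method I concrete, here is a minimal sketch. The abstract does not say which statistic the method uses, so the sketch assumes a simple one: two words are related to the degree that they appear together in the same dictionary definitions, scored with a Jaccard-style ratio. The definitions, the function names `build_relatedness` and `relatedness`, and the scoring formula are all illustrative assumptions; none of them is taken from LDOCE or from the paper.

```python
from collections import Counter
from itertools import combinations

# Toy definitions invented for illustration; they are NOT LDOCE entries.
# Each headword maps to the content words of its (hypothetical) definition.
TOY_DEFINITIONS = {
    "bank": ["place", "money", "keep"],
    "loan": ["money", "lend", "bank"],
    "river": ["water", "flow", "land"],
    "shore": ["land", "water", "edge"],
}


def build_relatedness(definitions):
    """Return {frozenset({a, b}): score} for words sharing a definition.

    Assumed score (not the paper's): the number of definitions containing
    both words divided by the number containing at least one of them.
    """
    word_count = Counter()  # definitions in which each word occurs
    pair_count = Counter()  # definitions in which each word pair co-occurs
    for headword, body in definitions.items():
        words = set(body) | {headword}
        word_count.update(words)
        pair_count.update(
            frozenset(pair) for pair in combinations(sorted(words), 2)
        )
    return {
        pair: n / (sum(word_count[w] for w in pair) - n)
        for pair, n in pair_count.items()
    }


def relatedness(network, a, b):
    """Look up the score for two words; unrelated pairs score 0.0."""
    return network.get(frozenset((a, b)), 0.0)


network = build_relatedness(TOY_DEFINITIONS)
```

With such a network, a crude form of partial sense disambiguation would prefer the sense whose definition words are most related to the words surrounding the ambiguous word, though whether the paper proceeds this way is not stated in the abstract.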
