Providing machine tractable dictionary tools

Machine readable dictionaries (MRDs) contain knowledge about language and the world that is essential for tasks in natural language processing (NLP). However, this knowledge was collected and recorded by lexicographers for human readers, and it is not presented in a form that lets MRDs be used directly for NLP tasks. What is needed are machine tractable dictionaries (MTDs): MRDs transformed into a format usable for NLP. This paper discusses three different but related large-scale computational methods for transforming MRDs into MTDs. The MRD used is the Longman Dictionary of Contemporary English (LDOCE). The three methods differ in the amount of knowledge they start with and in the kinds of knowledge they provide. All require some handcoding of initial information but are largely automatic. Method I, a statistical approach, uses the least handcoding. It generates “relatedness” networks for words in LDOCE and presents a method for partial word sense disambiguation. Method II employs the most handcoding because it develops and builds lexical entries for a very carefully controlled defining vocabulary of 2,000 word senses (1,000 words). The payoff is an MTD containing highly structured semantic information. Method III requires handcoding a grammar and the semantic patterns used by its parser, but no lexical material, because the method builds up lexical material from sources wholly within LDOCE. The information extracted is a set of sources of information, each individually weak, that can be combined to give a strong and determinate linguistic database.
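As a rough illustration of the kind of processing Method I points toward, the following Python sketch counts how often word pairs co-occur within dictionary definitions and then picks a word sense by definition overlap with a context. It is not the paper's algorithm: the toy definitions, the stopword list, and the function names `cooccurrence_counts` and `best_sense` are all assumptions made for this example, and the disambiguation step is a simplified definition-overlap heuristic in the style of Lesk's 1986 method rather than the statistical procedure the paper develops.

```python
from collections import Counter
from itertools import combinations

# Toy definitions invented for illustration; they are NOT taken from LDOCE.
# Sense keys use a hypothetical "word_N" naming scheme.
DEFINITIONS = {
    "bank_1": "land along the side of a river",
    "bank_2": "a place where money is kept",
    "river": "a wide flow of water across the land",
    "money": "coins or notes kept in a bank",
}

# Minimal hand-picked stopword list (an assumption of this sketch).
STOPWORDS = {"a", "of", "the", "in", "or", "is", "where", "along", "across"}


def content_words(text):
    """Lowercase the text, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def cooccurrence_counts(definitions):
    """Count how many definitions each unordered word pair shares."""
    counts = Counter()
    for text in definitions.values():
        words = sorted(set(content_words(text)))
        for pair in combinations(words, 2):
            counts[pair] += 1
    return counts


def best_sense(word, context, definitions):
    """Return the sense of `word` whose definition overlaps `context` most."""
    context_set = set(content_words(context))
    senses = [s for s in definitions if s.rsplit("_", 1)[0] == word]
    return max(
        senses,
        key=lambda s: len(context_set & set(content_words(definitions[s]))),
    )


counts = cooccurrence_counts(DEFINITIONS)
river_sense = best_sense("bank", "he sat by the river on the land", DEFINITIONS)
money_sense = best_sense("bank", "she kept her money there", DEFINITIONS)
```

The pair counts could serve as raw edge weights for a relatedness network; a real system would need normalisation, a far larger vocabulary, and proper tokenisation, none of which this sketch attempts.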
