An Assessment of Cyc for Natural Language Processing

This is the final report on an assessment of Cyc for natural language processing (NLP) applications. The work reported here was carried out by the authors at CRL, NMSU, in collaboration with both the Department of Defense and Cycorp, Inc. The primary motivation for this relatively small-scale exercise was to arrive at an independent assessment of the utility of Cyc’s knowledge and inference capabilities for solving difficult problems in NLP and machine translation. Word sense disambiguation and coreference resolution were chosen as the two problems for this study. We conclude that Cyc does contain a large amount of knowledge that is potentially useful for solving these problems. However, that knowledge is not directly applicable to them, either in an exclusively Cyc-based solution or in one where Cyc is used to improve the performance of other methods. In this report, we identify the primary reasons why Cyc cannot readily solve NLP problems, illustrate our findings with many real-world examples, and suggest changes or enhancements to Cyc that might make its knowledge more readily applicable to NLP problems.