Coreference, cross-document coreference, and information extraction methodologies
Much work has been done in the field of Natural Language Processing in the last decade, especially in the areas of information extraction and text (document) retrieval. Numerous systems have been developed, each using its own techniques and theories for processing text. The explosive growth of the Internet, and of the amount of information available on the information super-highway, has created large collections of free text that are readily available to a large number of people. This has created opportunities for computers to play an increasingly important role in processing these large collections of text. The same growth has also given impetus to most of the current areas of research in Natural Language Processing.
This dissertation presents several Natural Language Processing (NLP) tools, both theoretical and practical, that further the research already done. In particular, the work described here involves systems for information extraction (IE), information retrieval (IR), document summarization, named entity identification (NE), word sense disambiguation (WSD), coreferencing, and cross-document coreferencing. In addition to building these systems, the research has focused on building models for analyzing the complexity of various NLP tasks, and on using these models to analyze the performance of systems on such tasks.