Power Networks Dialogs - Enhancing Domain-Specific Text Processing Techniques and Resources
In this paper, we describe the development of analytical
approaches adapted for working with technical texts specialized
in the domain of electrical power networks (EPN). The process
includes improving the quality of the EPN resources. The new
data represent one of the largest domain-specific corpora,
containing more than 5 million text tokens.
We show the details of building the new large domain-specific
corpus, its analysis, and further processing such as filtering,
morphological and syntactic analysis, and phrase detection, and
present how they help to improve the dialog system.