Power grid enterprises hold a large volume of texts recorded in Chinese, and these texts contain abundant information about the power system. Mining this information manually is inefficient, and its accuracy varies between dispatchers. This paper takes power fault countermeasure texts as the object of study and investigates an information extraction method for Chinese power texts. First, the texts are segmented using natural language processing (NLP), and an ontology lexicon is established according to the attributes of the power-related words in the fault countermeasure texts. Second, based on the syntactic characteristics of punctuation, the concept of the separate parsing phrase is introduced to guide the division of long texts, so that each resulting unit contains only one power entity and its related information. Third, a syntax rule template applicable to the separate parsing phrase is built from meta-character templates (generalization slot, fixed word combination, wildcard character, and registry function); it is used to extract information from power fault preplan texts and to output that information in structured form. Finally, the generalization ability and universality of the template are analyzed. Examples show that the rule template applies to information extraction from most of the texts, with strong universality and high accuracy.