User information extraction in big data environment
In the era of big data, massive volumes of unstructured data contain a wealth of knowledge, and finding this knowledge by manual effort is unrealistic, so we study a method for automatically extracting attributes and attribute values from unstructured text. We use the structured infoboxes of the Chinese interactive encyclopedia to extract relation triples and build a relation knowledge base, and then use this knowledge base for back annotation: each sentence that contains a tuple from the knowledge base is added to the training corpus. This method avoids manual annotation and effectively addresses the problem of an insufficient training corpus, as confirmed by experiments.
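The back-annotation step can be illustrated with a minimal Python sketch. It assumes infoboxes are available as nested dictionaries and that a sentence "includes" a tuple when both the entity and the attribute value appear in it as substrings. The names `Triple`, `build_knowledge_base`, and `back_annotate`, and the sample data, are hypothetical and are not taken from the paper.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One (entity, attribute, value) relation taken from an infobox."""
    entity: str
    attribute: str
    value: str


def build_knowledge_base(infoboxes):
    """Flatten {entity: {attribute: value}} infoboxes into a list of triples."""
    return [
        Triple(entity, attribute, value)
        for entity, fields in infoboxes.items()
        for attribute, value in fields.items()
    ]


def back_annotate(sentences, knowledge_base):
    """Pair each sentence with every triple whose entity and value it mentions.

    Assumption: plain substring matching stands in for whatever matching
    the original system uses.
    """
    corpus = []
    for sentence in sentences:
        for triple in knowledge_base:
            if triple.entity in sentence and triple.value in sentence:
                corpus.append((sentence, triple))
    return corpus


# Hypothetical sample data, for illustration only.
infoboxes = {"Lu Xun": {"birthplace": "Shaoxing", "occupation": "writer"}}
sentences = [
    "Lu Xun was born in Shaoxing in 1881.",
    "Lu Xun is a well known writer.",
    "Shaoxing is a city in Zhejiang.",
]

knowledge_base = build_knowledge_base(infoboxes)
training_corpus = back_annotate(sentences, knowledge_base)
for sentence, triple in training_corpus:
    print(triple.attribute, "<-", sentence)
```

In this sketch the third sentence is skipped because it mentions the value but not the entity, which is how automatically labeled sentences are gathered without manual annotation. Substring matching of this kind can also produce noisy labels, so a real pipeline would likely need additional filtering.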