Keyword extraction method and device
The invention discloses a keyword extraction method and device. The method comprises the following steps: providing corpus data for a field, the corpus data comprising a plurality of documents; pre-processing the corpus data to obtain text data; segmenting the text data to obtain a plurality of corpus words; filtering the corpus words to obtain a plurality of candidate words; setting an initial weight for each candidate word; adjusting the initial weight of each candidate word according to its co-occurrence relations within each document, thereby obtaining a final weight for the candidate word in each document; and determining the keywords of each document according to the final weights. With this technical scheme, keywords can be accurately extracted from a corpus in a given field.
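The steps above can be illustrated with a minimal Python sketch. The abstract does not state how the weights are adjusted, so this example assumes a TextRank-style iteration over a co-occurrence graph; the stopword list, the `window`, `damping`, and `iterations` parameters, and all function names are hypothetical choices made for illustration, not details taken from the invention.

```python
import re
from collections import defaultdict

# Hypothetical stopword list standing in for the filtering step.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "for", "on", "with", "by"}


def candidates(text):
    """Segment text into lowercase words, then drop stopwords and short tokens."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOPWORDS and len(w) > 2]


def cooccurrence_graph(words, window=3):
    """Link each word to the distinct words among the next `window - 1` positions."""
    graph = defaultdict(set)
    for i, w in enumerate(words):
        for other in words[i + 1 : i + window]:
            if other != w:
                graph[w].add(other)
                graph[other].add(w)
    return graph


def rank(words, window=3, damping=0.85, iterations=30):
    """Start every candidate at weight 1.0 and adjust it from its co-occurring words."""
    graph = cooccurrence_graph(words, window)
    weights = {w: 1.0 for w in graph}  # initial weight for each candidate word
    for _ in range(iterations):
        # Assumed update rule (TextRank-style): each neighbour shares its
        # weight equally among the words it co-occurs with.
        weights = {
            w: (1 - damping)
            + damping * sum(weights[n] / len(graph[n]) for n in graph[w])
            for w in graph
        }
    return weights


def keywords(text, top_k=3):
    """Return the `top_k` candidate words of one document by final weight."""
    weights = rank(candidates(text))
    return sorted(weights, key=lambda w: (-weights[w], w))[:top_k]
```

Calling `keywords(document_text)` once per document in the corpus gives that document's keywords. Words that co-occur with many other candidates end up with higher final weights than words that appear in isolation.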
[1] Rada Mihalcea, et al. TextRank: Bringing Order into Text, 2004, EMNLP.