Research on Summary Sentence Extraction Oriented to Live Sports Text

To enable the automatic generation of sports news, this paper proposes a method for extracting summary sentences from live sports text. After analyzing the characteristics of live sports text, we treat summary sentence extraction as a sequence tagging problem and use Conditional Random Fields (CRFs) as the extraction model. First, we expand the set of words correlated with the keywords using word2vec. Then, we select positively correlated words, negatively correlated words, time, and the window of score changes as features to train the model and extract summary sentences. The method achieves good results on the ROUGE-1, ROUGE-2, and ROUGE-SU4 evaluation metrics, which suggests that it can contribute to automatic summarization and the automatic generation of sports news.
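
As an illustration of the feature design described above, the following is a minimal sketch of how each line of live text might be turned into a feature dictionary for a sequence tagger. It is not the authors' implementation. The `Entry` record, the word lists `POSITIVE_WORDS` and `NEGATIVE_WORDS`, the 15-minute time bucket, and the window size are all assumptions made for this example; in the method described here the word lists would come from expanding keywords with word2vec rather than being written by hand. The output, one dictionary per line of commentary, has the general shape that CRF toolkits typically accept as input.

```python
import re
from typing import Dict, List, NamedTuple, Set, Tuple


class Entry(NamedTuple):
    """One line of live commentary (hypothetical record layout)."""
    minute: int
    score: Tuple[int, int]  # (home, away) at the time of the entry
    text: str


# Hypothetical hand-written word lists, standing in for word2vec expansion.
POSITIVE_WORDS: Set[str] = {"goal", "scores", "penalty"}
NEGATIVE_WORDS: Set[str] = {"throw-in", "offside", "clearance"}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and keep alphabetic tokens, allowing inner hyphens."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())


def score_changed_in_window(entries: List[Entry], i: int, window: int) -> bool:
    """True if the score differs between any two adjacent entries near i."""
    lo = max(0, i - window)
    hi = min(len(entries) - 1, i + window)
    return any(entries[j].score != entries[j + 1].score for j in range(lo, hi))


def entry_features(entries: List[Entry], i: int, window: int = 1) -> Dict[str, object]:
    """Build the feature dictionary for the i-th entry of a match."""
    tokens = tokenize(entries[i].text)
    return {
        "pos_words": sum(t in POSITIVE_WORDS for t in tokens),
        "neg_words": sum(t in NEGATIVE_WORDS for t in tokens),
        "time_bucket": entries[i].minute // 15,  # assumed 15-minute buckets
        "score_change": score_changed_in_window(entries, i, window),
    }


def match_features(entries: List[Entry], window: int = 1) -> List[Dict[str, object]]:
    """Feature sequence for a whole match, one dictionary per entry."""
    return [entry_features(entries, i, window) for i in range(len(entries))]


# Small made-up example sequence.
sample = [
    Entry(10, (0, 0), "Throw-in for the home side"),
    Entry(12, (0, 0), "Shot from distance"),
    Entry(13, (1, 0), "Goal! The striker scores"),
    Entry(20, (1, 0), "Offside flag goes up"),
]
feats = match_features(sample)
```

Each match yields one such sequence, and a CRF would be trained on these sequences paired with per-line labels marking whether the line is a summary sentence. Treating the score-change window as a feature of neighbouring lines, rather than only the line where the score changes, reflects the idea that the commentary around a goal tends to be summary-worthy as well.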