Football News Generation from Chinese Live Webcast Script

Automatically generating sports news from live webcast scripts poses two challenges: (1) accurately identifying hot events and the sentences that describe them, and (2) organizing the selected sentences into highly readable text. This paper proposes a framework for generating sports news automatically. First, to identify hot events and sentences accurately, we design a neural network that predicts the probability that each statement in a live webcast script appears in the written news. The inputs of the network are weighted word vectors obtained with a dictionary of football keywords, and the target outputs are the similarities between statements in the training webcast scripts and sentences in the training news. The “good” sentences selected from the webcast in this way form a semi-finished sports news article. Second, to make the generated news as similar as possible to human writing, we insert idioms that frequently appear in football reporting between the selected sentences to describe or summarize the development or turning points of the game, producing the final sports news. The proposed framework is validated on the training and test data sets provided by the “Sports News Generation from Live Webcast Scripts” task of NLPCC 2016, and the experiments show that the proposed method performs well.
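The sentence-selection stage can be illustrated with a minimal sketch, which is not the authors' implementation. It shows one way to build keyword-weighted sentence vectors (the network inputs), to compute a similarity-based training target, and to keep the top-scoring statements in their original order. The keyword weights, the Jaccard overlap used as the similarity measure, and all function names are assumptions made for illustration; the neural network itself is omitted, and its predicted scores are simply passed in as a list.

```python
from typing import Dict, List, Sequence

# Hypothetical keyword weights; the paper's football keyword dictionary
# is not reproduced here.
KEYWORD_WEIGHTS: Dict[str, float] = {"goal": 3.0, "penalty": 2.5, "shot": 2.0}
DEFAULT_WEIGHT = 1.0


def weighted_sentence_vector(
    tokens: Sequence[str], embeddings: Dict[str, List[float]], dim: int
) -> List[float]:
    """Weighted average of word vectors; dictionary keywords count more."""
    total = [0.0] * dim
    weight_sum = 0.0
    for tok in tokens:
        vec = embeddings.get(tok)
        if vec is None:  # skip out-of-vocabulary tokens
            continue
        w = KEYWORD_WEIGHTS.get(tok, DEFAULT_WEIGHT)
        for i in range(dim):
            total[i] += w * vec[i]
        weight_sum += w
    if weight_sum == 0.0:
        return total
    return [x / weight_sum for x in total]


def overlap_similarity(
    statement: Sequence[str], news_sentences: Sequence[Sequence[str]]
) -> float:
    """Training target: best Jaccard overlap with any news sentence.

    Jaccard overlap is an assumed stand-in for the similarity measure.
    """
    s = set(statement)
    best = 0.0
    for sent in news_sentences:
        union = s | set(sent)
        if union:
            best = max(best, len(s & set(sent)) / len(union))
    return best


def select_top(statements: Sequence[str], scores: Sequence[float], k: int) -> List[str]:
    """Keep the k highest-scoring statements, preserving webcast order."""
    ranked = sorted(range(len(statements)), key=lambda i: scores[i], reverse=True)[:k]
    return [statements[i] for i in sorted(ranked)]
```

In this sketch, a trained model would map each output of `weighted_sentence_vector` to a score, and `select_top` would then yield the semi-finished news. Preserving the original order keeps the selected statements aligned with the chronology of the match.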