Knowledge Enhanced Sports Game Summarization

Sports game summarization aims to generate sports news from live commentaries. However, existing datasets are all constructed through automated collection and cleaning processes, and thus contain substantial noise. Moreover, prior work neglects the knowledge gap between live commentaries and sports news, which limits the performance of sports game summarization. In this paper, we introduce K-SportsSum, a new dataset with two characteristics: (1) K-SportsSum collects a large amount of data from a massive number of games, comprising 7,854 commentary-news pairs, and employs a manual cleaning process to improve data quality; (2) unlike existing datasets, K-SportsSum additionally provides a large-scale knowledge corpus covering 523 sports teams and 14,724 sports players, in order to narrow the knowledge gap. We further propose a knowledge-enhanced summarizer that exploits both live commentaries and the knowledge corpus to generate sports news. Extensive experiments on the K-SportsSum and SportsSum datasets show that our model achieves new state-of-the-art performance. Qualitative analysis and a human study further verify that our model generates more informative sports news.
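To make the knowledge-enhanced setup concrete, below is a minimal sketch of one common way to fuse an external knowledge corpus with commentary text: retrieve knowledge entries for entities mentioned in the commentary and prepend them to the input of a pretrained seq2seq summarizer. This is an illustration only, not the paper's actual pipeline: the `knowledge_corpus` entries, `find_entities`, and `build_input` helpers are hypothetical, and it assumes the HuggingFace Transformers library with an mT5 checkpoint (an off-the-shelf, non-fine-tuned model will not produce usable news; a task-specific fine-tuned checkpoint would be needed in practice).

```python
# Minimal sketch: fuse retrieved knowledge with live commentary as seq2seq input.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical knowledge corpus: entity name -> short factual description.
knowledge_corpus = {
    "Lionel Messi": "Argentine forward playing for FC Barcelona.",
    "FC Barcelona": "Spanish club competing in La Liga.",
}

def find_entities(commentary: str) -> list[str]:
    """Toy entity linker via substring match against the corpus keys.
    A real system would use NER / entity linking instead."""
    return [name for name in knowledge_corpus if name in commentary]

def build_input(commentary: str) -> str:
    """Prepend retrieved knowledge to the commentary so the encoder
    sees both; simple concatenation is one basic fusion strategy."""
    facts = " ".join(knowledge_corpus[e] for e in find_entities(commentary))
    return f"knowledge: {facts} commentary: {commentary}"

# Any pretrained seq2seq model works for the sketch; mT5 is one option.
tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-small")

commentary = "23' Lionel Messi curls a free kick just wide for FC Barcelona."
inputs = tokenizer(build_input(commentary), return_tensors="pt", truncation=True)
summary_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```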

[1]  Li Yang,et al.  ETC: Encoding Long and Structured Inputs in Transformers , 2020, EMNLP.

[2]  Gina-Anne Levow,et al.  The Third International Chinese Language Processing Bakeoff: Word Segmentation and Named Entity Recognition , 2006, SIGHAN@COLING/ACL.

[3]  Xiaojun Wan,et al.  Content Selection for Real-time Sports News Construction from Commentary Texts , 2017, INLG.

[4]  Christopher D. Manning,et al.  Get To The Point: Summarization with Pointer-Generator Networks , 2017, ACL.

[5]  Colin Raffel,et al.  mT5: A Massively Multilingual Pre-trained Text-to-Text Transformer , 2021, NAACL.

[6]  Kilian Q. Weinberger,et al.  BERTScore: Evaluating Text Generation with BERT , 2019, ICLR.

[7]  Omer Levy,et al.  BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension , 2019, ACL.

[8]  Alexander M. Rush,et al.  Abstractive Sentence Summarization with Attentive Recurrent Neural Networks , 2016, NAACL.

[9]  Han Ren,et al.  Sports News Generation from Live Webcast Scripts Based on Rules and Templates , 2016, NLPCC/ICCPOL.

[10]  Arman Cohan,et al.  Longformer: The Long-Document Transformer , 2020, ArXiv.

[11]  Jianshe Zhou,et al.  Research on Summary Sentences Extraction Oriented to Live Sports Text , 2016, NLPCC/ICCPOL.

[12]  Mirella Lapata,et al.  Sentence Centrality Revisited for Unsupervised Summarization , 2019, ACL.

[13]  Qiang Yang,et al.  SportsSum2.0: Generating High-Quality Sports News from Live Text Commentary , 2021, CIKM.

[14]  Lysandre Debut,et al.  HuggingFace's Transformers: State-of-the-art Natural Language Processing , 2019, ArXiv.

[15]  Xiaojun Wan,et al.  Overview of the NLPCC-ICCPOL 2016 Shared Task: Sports News Generation from Live Webcast Scripts , 2016, NLPCC/ICCPOL.

[16]  Ani Nenkova,et al.  Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies , 2016, NAACL 2016.

[17]  Chen Li,et al.  Generating Sports News from Live Commentary: A Chinese Dataset for Sports Game Summarization , 2020, AACL.

[18]  Jason Weston,et al.  A Neural Attention Model for Abstractive Sentence Summarization , 2015, EMNLP.

[19]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[20]  Rada Mihalcea,et al.  TextRank: Bringing Order into Text , 2004, EMNLP.

[21]  Yoon Kim,et al.  Convolutional Neural Networks for Sentence Classification , 2014, EMNLP.

[22]  Xipeng Qiu,et al.  FLAT: Chinese NER Using Flat-Lattice Transformer , 2020, ACL.

[23]  Jianshe Zhou,et al.  Generate Football News from Live Webcast Scripts Based on Character-CNN with Five Strokes , 2020 .

[24]  Bowen Zhou,et al.  Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond , 2016, CoNLL.

[25]  Xiaojun Wan,et al.  Towards Constructing Sports News from Live Text Commentary , 2016, ACL.

[26]  Colin Raffel,et al.  Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer , 2019, J. Mach. Learn. Res..