Paper Information - nagoy Team's Summarization System at the NTCIR-14 QA Lab-PoliInfo

The nagoy team participated in the summarization subtask of the NTCIR-14 QA Lab-PoliInfo task. This paper describes our system for summarizing assembly member speeches using random forest classifiers. Because the data were imbalanced, training on all of the data did not yield good results in this subtask. To solve this problem, we developed a new summarization system that applies, step by step, multiple random forest classifiers trained on data sets of different sizes. As a result, our system achieved good performance, especially in the evaluation by ROUGE scores. In this paper, we also compare our system with a single random forest classifier using probability.
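The abstract does not spell out how the step-by-step classifiers are built, so the following is only a minimal sketch of one way such a cascade could look, not the authors' implementation. It assumes binary labels (1 = sentence belongs in the summary, 0 = it does not), uses scikit-learn's `RandomForestClassifier`, and assumes that each later stage is trained only on the sentences the previous stage did not reject, which is what makes the training sets differ in size. The names `train_cascade` and `cascade_predict`, the `threshold` value, and the use of out-of-bag estimates for filtering are all hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def train_cascade(X, y, n_stages=3, threshold=0.2, seed=0):
    """Train up to n_stages forests on progressively smaller training sets.

    Assumption (not from the paper): a sentence is dropped from later
    stages when its out-of-bag probability of being important falls
    below `threshold`, so the remaining data become less imbalanced.
    """
    stages = []
    keep = np.ones(len(y), dtype=bool)  # sentences still in play
    for stage in range(n_stages):
        # Stop when the surviving data no longer contain both classes.
        if len(np.unique(y[keep])) < 2:
            break
        clf = RandomForestClassifier(
            n_estimators=50, oob_score=True, random_state=seed + stage
        )
        clf.fit(X[keep], y[keep])
        stages.append(clf)
        # Out-of-bag probabilities avoid judging a training sentence
        # with trees that have already seen it.
        oob = clf.oob_decision_function_[:, 1]
        idx = np.flatnonzero(keep)
        keep[idx[oob < threshold]] = False
    return stages


def cascade_predict(stages, X, threshold=0.2):
    """Return a boolean mask of sentences that survive every stage."""
    alive = np.ones(X.shape[0], dtype=bool)
    for clf in stages:
        if not alive.any():
            break
        idx = np.flatnonzero(alive)
        proba = clf.predict_proba(X[idx])[:, 1]
        alive[idx[proba < threshold]] = False
    return alive
```

Under these assumptions, the single-classifier baseline mentioned in the abstract would correspond to using only the first stage, for example `cascade_predict(stages[:1], X)`.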

Yasuhiro Ogawa | Katsuhiko Toyama | Takahiro Komamizu | Michiaki Satou
