nagoy Team's Summarization System at the NTCIR-14 QA Lab-PoliInfo

The nagoy team participated in the summarization subtask of the NTCIR-14 QA Lab-PoliInfo. This paper describes our system for summarizing assembly member speeches using random forest classifiers. Because the training data were imbalanced, a classifier trained on all of the data did not achieve good results in this subtask. To solve this problem, we developed a new summarization system that applies, step by step, multiple random forest classifiers trained on data sets of different sizes. As a result, our system achieved good performance, especially in the evaluation by ROUGE scores. We also compare our system with a single random forest classifier that uses probability.
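
To make the step-by-step idea concrete, the following Python sketch shows one possible reading of it. The abstract does not say how the training subsets are built or how the classifiers are combined, so everything here is an assumption: each stage keeps all positive (summary) sentences and a different number of randomly chosen negative sentences, and a sentence is selected only if every stage accepts it. The names `undersample`, `fit_cascade`, `predict_cascade`, and `negative_sizes` are hypothetical, and `MeanThresholdClassifier` is a toy stand-in for a random forest so that the sketch runs with the standard library alone.

```python
import random


def undersample(X, y, n_negative, rng):
    """Keep every positive example and at most n_negative random negatives."""
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    keep = pos + rng.sample(neg, min(n_negative, len(neg)))
    return [X[i] for i in keep], [y[i] for i in keep]


def fit_cascade(make_classifier, X, y, negative_sizes, seed=0):
    """Train one classifier per stage, each on a differently sized subset.

    ASSUMPTION: the subset sizes and the undersampling scheme are
    illustrative; the abstract does not specify them.
    """
    rng = random.Random(seed)
    stages = []
    for n_negative in negative_sizes:
        X_sub, y_sub = undersample(X, y, n_negative, rng)
        clf = make_classifier()
        clf.fit(X_sub, y_sub)
        stages.append(clf)
    return stages


def predict_cascade(stages, X):
    """Apply the stages in order; a sentence is selected (label 1) only if
    every stage labels it 1. ASSUMPTION: this combination rule is a guess."""
    alive = list(range(len(X)))
    for clf in stages:
        if not alive:
            break
        preds = clf.predict([X[i] for i in alive])
        alive = [i for i, p in zip(alive, preds) if p == 1]
    selected = set(alive)
    return [1 if i in selected else 0 for i in range(len(X))]


class MeanThresholdClassifier:
    """Toy stand-in for a random forest (hypothetical, for illustration).

    It thresholds the first feature at the midpoint between the class means
    and needs at least one example of each class in the training subset.
    """

    def fit(self, X, y):
        pos = [x[0] for x, label in zip(X, y) if label == 1]
        neg = [x[0] for x, label in zip(X, y) if label == 0]
        self.threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
        return self

    def predict(self, X):
        return [1 if x[0] > self.threshold else 0 for x in X]
```

Any object with `fit` and `predict` methods can be supplied through `make_classifier`. For example, with scikit-learn installed, `lambda: RandomForestClassifier(n_estimators=100)` could replace the toy classifier, provided the features are numeric vectors.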