Corpus construction for topic-based summarization of multi-party conversation

In this paper, we report on corpus construction and topic-based summarization methods for multi-party conversation. We have previously constructed reference summaries and a list of important utterances for each discussion. However, fine-grained summaries of the individual topics in a discussion are often desired. Therefore, we construct topic-based summaries and propose an important utterance extraction method together with two summarization processes that use the extracted utterances: an extractive method and an abstractive method. For important utterance extraction, we use SVMs with 12 types of features. For the abstractive method, we use mBART, a neural network-based model. In the experiment, the extractive method was superior in terms of “accuracy as a summary (relevance),” while the abstractive method was superior in terms of readability.
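
As a rough illustration of the extractive process described above, the following minimal sketch scores each utterance with a linear decision function and keeps those scored as important, in their original order. It is a sketch under stated assumptions, not the method used in the paper: the three features, the hand-set weights and bias (standing in for a trained SVM), and the names `Utterance`, `features`, `linear_score`, and `extractive_summary` are all hypothetical. The 12 feature types are not listed in this abstract, and the mBART-based abstractive step is not shown.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Utterance:
    speaker: str
    text: str


def features(utt: Utterance, index: int, total: int) -> List[float]:
    """Hypothetical features; these are NOT the 12 feature types of the paper."""
    tokens = utt.text.split()
    return [
        float(len(tokens)),                                # length in tokens
        index / max(total - 1, 1),                         # relative position
        1.0 if utt.text.rstrip().endswith("?") else 0.0,   # ends with a question mark
    ]


def linear_score(weights: Sequence[float], bias: float, x: Sequence[float]) -> float:
    """Decision value w . x + b, the form a trained linear SVM would produce."""
    return sum(w * v for w, v in zip(weights, x)) + bias


def extractive_summary(
    utterances: Sequence[Utterance], weights: Sequence[float], bias: float
) -> List[Utterance]:
    """Keep utterances with a positive decision value, preserving their order."""
    total = len(utterances)
    return [
        utt
        for i, utt in enumerate(utterances)
        if linear_score(weights, bias, features(utt, i, total)) > 0.0
    ]


# Toy discussion; weights and bias are assumed values, not learned parameters.
discussion = [
    Utterance("A", "Hello"),
    Utterance("B", "I think we should choose the cheaper plan because of the budget"),
    Utterance("C", "OK"),
    Utterance("A", "Then let us decide on the cheaper plan and book it today"),
]
weights = [0.5, 0.0, -1.0]
bias = -2.0

summary = extractive_summary(discussion, weights, bias)
for utt in summary:
    print(f"{utt.speaker}: {utt.text}")
```

In practice the weights would come from a classifier trained on the annotated list of important utterances, and the selected utterances could be passed to an abstractive model instead of being output directly.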