PALRACE: Reading Comprehension Dataset with Human Data and Labeled Rationales

Pre-trained language models achieves high performance on machine reading comprehension (MRC) tasks but the results are hard to explain. An appealing approach to make models explainable is to provide rationales for its decision. To facilitate supervised learning of human rationales, here we present PALRACE (Pruned And Labeled RACE), a new MRC dataset with human labeled rationales for 800 passages selected from the RACE dataset. We further classified the question to each passage into 6 types. Each passage was read by at least 26 participants, who labeled their rationales to answer the question. Besides, we conducted a rationale evaluation session in which participants were asked to answering the question solely based on labeled rationales, confirming that the labeled rationales were of high quality and can sufficiently support question answering.