University of Houston at CL-SciSumm 2016: SVMs with tree kernels and Sentence Similarity

This paper describes the University of Houston team’s work on identifying reference spans in a reference document, given sentences from other documents that cite it. We investigated two approaches: cosine similarity with multiple incremental modifications, and SVMs with a tree kernel. Although the best-performing approach in our experiments is quite simple, it is not the best under every metric used for comparison. We also present a brief analysis of the dataset, including its sparsity and the frequency of section titles.
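
To make the cosine-similarity baseline concrete, the following is a minimal sketch of ranking reference-document sentences against a citing sentence. It is an illustration only, not the system described in this paper: the abstract does not specify the term weighting or the incremental modifications, so raw term frequencies and a simple regex tokenizer are assumed here, and the names `tokenize`, `cosine_similarity`, and `rank_reference_sentences` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty sentence is treated as similar to nothing
    return dot / (norm_a * norm_b)


def rank_reference_sentences(citance, reference_sentences, top_k=3):
    """Return (index, score) pairs for the top_k most similar reference sentences."""
    citance_vec = Counter(tokenize(citance))
    scored = [
        (i, cosine_similarity(citance_vec, Counter(tokenize(sentence))))
        for i, sentence in enumerate(reference_sentences)
    ]
    # Highest similarity first; ties keep document order because sort is stable.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    reference = [
        "We train a parser on treebank data.",
        "The tree kernel compares parse trees.",
        "Results are reported on the test set.",
    ]
    citing = "They use a tree kernel over parse trees."
    for index, score in rank_reference_sentences(citing, reference, top_k=2):
        print(index, round(score, 3), reference[index])
```

In this sketch the highest-scoring sentences stand in for the predicted reference span; a weighting scheme such as TF-IDF or stop-word removal could replace the raw counts without changing the overall structure.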