Coreference Resolution without Span Representations

Since the introduction of deep pretrained language models, most task-specific NLP models have been reduced to simple, lightweight layers. An exception to this trend is the challenging task of coreference resolution, where a sophisticated end-to-end model is appended to a pretrained transformer encoder. While highly effective, the model has a very large memory footprint, primarily due to dynamically constructed span and span-pair representations, which hinders the processing of complete documents and the ability to train on multiple instances in a single batch. We introduce a lightweight coreference model that removes the dependency on span representations, handcrafted features, and heuristics. Our model performs competitively with the current end-to-end model, while being simpler and more efficient.
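To make the memory argument concrete, the following PyTorch sketch contrasts the two strategies the abstract alludes to: materializing an explicit representation for every candidate span versus scoring token pairs directly with a bilinear form. This is an illustrative toy, not the paper's architecture; the tensor sizes, the endpoint-concatenation span encoding, and the single bilinear scorer are all assumptions made for exposition.

```python
# Illustrative toy, not the paper's model: why materialized span
# representations dominate memory, and a span-free alternative.
import torch

n, d = 128, 768                    # number of tokens, hidden size (assumed)
tokens = torch.randn(n, d)         # contextualized token embeddings

# Span-based scoring: every candidate span (i, j) gets an explicit vector,
# here a simple endpoint concatenation. Memory grows as O(n^2 * d), and
# span-pair representations would grow faster still.
starts = tokens.unsqueeze(1).expand(n, n, d)    # start-token embedding per span
ends = tokens.unsqueeze(0).expand(n, n, d)      # end-token embedding per span
span_reprs = torch.cat([starts, ends], dim=-1)  # (n, n, 2d): ~100 MB at n=128

# Span-free scoring: a bilinear form over token representations yields an
# (n, n) score matrix directly, with no span vectors ever stored.
W = torch.randn(d, d)
pair_scores = tokens @ W @ tokens.T             # (n, n) pairwise scores
```

Scaling n from 128 to a document of several thousand tokens grows the span tensor quadratically in both dimensions, while the pairwise score matrix stays comparatively small; this asymmetry is what the abstract's efficiency claim rests on.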