Stanford's Distantly-Supervised Slot-Filling System

This paper describes the design and implementation of the slot filling system prepared by Stanford’s natural language processing group for the 2011 Knowledge Base Population (KBP) track at the Text Analysis Conference (TAC). Our system relies on a simple distant supervision approach that uses mainly resources furnished by the track’s organizers: we took slot examples from the provided knowledge base and mapped them to documents from several corpora, namely those distributed by the organizers, Wikipedia, and web snippets. This system is a descendant of Stanford’s system from last year, with several improvements: an inference process that allows for multi-label predictions and uses world knowledge to validate outputs; model combination; and a tighter integration of entity coreference and web snippets in the training process. Our submissions scored 16 F1 points using web snippets and 13.5 F1 points without web snippets; both scores are higher than the median score of 12.7 F1. We also describe our temporal slot filling system, which achieved 37.0 F1 on the diagnostic temporal task on the development queries.