Stanford at TAC KBP 2016: Sealing Pipeline Leaks and Understanding Chinese

We describe Stanford’s entries in the TAC KBP 2016 Cold Start Slot Filling and Knowledge Base Population challenge. Our biggest contribution is an entirely new Chinese entity detection and relation extraction system for the new Chinese and cross-lingual relation extraction tracks. This new system consists of several rule-based relation extractors and a distantly supervised extractor. We also analyze errors produced by our existing mature English KBP system, which leads to several fixes, notably improvements to our patterns-based extractor and neural network model, and support for nested mentions and inferred relations. Stanford’s 2016 English, Chinese and cross-lingual submissions achieved an overall (macro-averaged LDC-MEAN) F1 of 22.0, 14.2, and 11.2 respectively on the 2016 evaluation data, performing well above the median entries, which scored 7.5, 13.2 and 8.3 respectively.