Textual Inference: getting logic from humans

This paper describes a manual investigation of the SICK corpus, the proposed test set for a new system for natural language inference. The system provides conceptual semantics for sentences, so that entailment, contradiction, and neutrality relations between sentences can be identified. Investigating the SICK corpus was necessary to check the quality of the test data, which is to be used as a gold standard for the new system. This check also provides crucial insights for implementing the system's components. The investigation showed that the human judgements used in building the SICK corpus can be erroneous, which degrades the quality of an otherwise useful resource. We also show that detecting the relationship between some sentence pairs in the SICK corpus requires more than lexical semantics alone, which provides us with guidelines and intuitions for our further implementation.