Efficient Reasoning with Large Knowledge Bases

Abstract: This report results from a contract tasking Manchester Informatics Ltd as follows: The grantee will investigate relevance determination for reasoning on systems that contain more than 100,000 first-order axioms. Relevance determination refers to techniques for examining very large knowledge bases to distinguish between relevant, possibly relevant, and not relevant information. The best existing approaches (from other researchers) are unable to cope with knowledge bases of 10,000 axioms. He will investigate these techniques in two phases. First, at six months he will deliver an extension to his existing system, Vampire, capable of resolving queries within seconds on knowledge bases of over 30,000 axioms. By the conclusion of the research, he will improve his relevance filtering techniques to enable Vampire to reason on knowledge bases with over 100,000 axioms within seconds. Complete details are described in the attached technical proposal. We again tested the new strategy by searching for inconsistencies in SUMO 1.72 with row variables expanded to sequences of length 50 (that is, a knowledge base with about 30,000 first-order axioms). When we used the negation of an axiom causing inconsistency as the query, inconsistency was always proved in less than one second. We believe query answering can be done much faster, in less than 0.1 second. Our experiments uncovered the following problem. When a knowledge base contains many similar atoms (e.g., ground facts with the instance predicate), just passing the knowledge base to Vampire's kernel may take over a second. Profiling showed that this time is spent almost entirely on building indexes rather than on query answering. Indexes in Vampire were not designed to handle large signatures and should be reimplemented for experiments with ontologies. Moreover, we think that indexes should be pre-compiled rather than built by the kernel. However, this is a subject for future research.