GATE Mímir: Answering Questions Google Can't

Free text makes up a large proportion of the information generated by modern society, and search engines such as Google are exceptionally good at finding, indexing and searching it. However, with the rise of the Semantic Web and the publication of increasingly large amounts of structured, interlinked data, useful information is now distributed across multiple sources and in a variety of formats. Conventional search engines cannot easily reconcile these sources, because much of the data is not amenable to free-text search. Hence, many questions that we may wish to ask of society's collective knowledge cannot be easily answered. For example, it is difficult to see how a traditional search engine could be used to locate documents in which a person born in Sheffield is quoted. In this paper, we describe GATE Mímir, which indexes not only free text but also semantic annotations and knowledge base data. The resulting multi-paradigm index allows us to search across multiple information sources in order to answer questions that are either infeasible or impossible to answer using current web search engines.
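To make the Sheffield example concrete, the following is a minimal toy sketch of a query that joins document annotations with knowledge base facts. It is not GATE Mímir's API or query language: the `Annotation` class, the `KB` and `DOCS` structures, the `ex:` identifiers and both functions are hypothetical names invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """A hypothetical semantic annotation attached to a document."""
    kind: str  # e.g. "Person" (a mention) or "Quote" (quoted speech)
    uri: str   # knowledge base identifier of the person involved


# Toy knowledge base of (subject, predicate, object) triples (assumed data).
KB = {
    ("ex:alice", "ex:birthPlace", "ex:Sheffield"),
    ("ex:bob", "ex:birthPlace", "ex:Leeds"),
}

# Toy annotation index: document id -> annotations found in it (assumed data).
DOCS = {
    "doc1": [Annotation("Quote", "ex:alice")],   # quotes Alice
    "doc2": [Annotation("Quote", "ex:bob")],     # quotes Bob
    "doc3": [Annotation("Person", "ex:alice")],  # mentions Alice, no quote
}


def born_in(place: str) -> set[str]:
    """Return the URIs of everyone the knowledge base says was born in `place`."""
    return {s for (s, p, o) in KB if p == "ex:birthPlace" and o == place}


def docs_quoting_people_born_in(place: str) -> list[str]:
    """Return ids of documents containing a Quote whose speaker was born in `place`."""
    speakers = born_in(place)
    return sorted(
        doc_id
        for doc_id, annotations in DOCS.items()
        if any(a.kind == "Quote" and a.uri in speakers for a in annotations)
    )


if __name__ == "__main__":
    print(docs_quoting_people_born_in("ex:Sheffield"))
```

In this sketch, a plain keyword search for "Sheffield" would find none of the three documents, since the birthplace is recorded only in the knowledge base; the query succeeds only because the annotation data and the knowledge base data are consulted together.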
