Towards question answering on statistical linked data

As more statistical data is published as linked data, intuitive ways of satisfying information needs and gaining new insights from that data become increasingly important. Question answering systems provide such an interface by translating natural language questions into SPARQL, the standard query language for RDF knowledge bases. Statistical data, however, is structurally very different from other data and cannot be queried using existing question answering approaches. We analyze the particularities of statistical data represented in the RDF Data Cube Vocabulary with respect to question answering and sketch a new question answering algorithm for statistical data. To estimate typical user questions, we compile a corpus of statistical questions and categorize its elements.
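
To illustrate the structural difference, consider a question such as "What was the population of Germany in 2010?". In the RDF Data Cube Vocabulary the answer is not a single triple attached to the entity Germany. It is the measure value of an observation that is identified by its dimension values. The following is a minimal sketch of the kind of SPARQL query a question answering system would have to produce. It is not the algorithm proposed here. The helper `build_observation_query` and all `http://example.org/` URIs are hypothetical and chosen for illustration only; `qb:Observation` and `qb:dataSet` are terms from the RDF Data Cube Vocabulary.

```python
# Namespace of the RDF Data Cube Vocabulary.
QB_PREFIX = "PREFIX qb: <http://purl.org/linked-data/cube#>"


def build_observation_query(dataset_uri, measure_uri, dimension_values):
    """Build a SPARQL SELECT query for the measure value of observations.

    dimension_values maps dimension property URIs to the value URIs that
    the observation must have. All URIs passed in are assumptions of the
    caller; this function only assembles the query text.
    """
    lines = [
        QB_PREFIX,
        "SELECT ?value WHERE {",
        "  ?obs a qb:Observation ;",
        f"       qb:dataSet <{dataset_uri}> ;",
    ]
    # Each dimension restricts the observation to one slice of the cube.
    for dimension_uri, value_uri in dimension_values.items():
        lines.append(f"       <{dimension_uri}> <{value_uri}> ;")
    # The measure property carries the actual statistical value.
    lines.append(f"       <{measure_uri}> ?value .")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical data set, measure, and dimensions for the example question.
query = build_observation_query(
    dataset_uri="http://example.org/dataset/population",
    measure_uri="http://example.org/measure/population",
    dimension_values={
        "http://example.org/dimension/refArea": "http://example.org/area/Germany",
        "http://example.org/dimension/refPeriod": "http://example.org/year/2010",
    },
)
print(query)
```

The sketch assumes that the data set, the measure, and the dimension values have already been identified from the question. Mapping natural language phrases to these cube components is the hard part, and it is what separates this setting from question answering over entity-centric knowledge bases.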