While there has been an explosion of impressive, data-driven AI applications in recent years, machines still largely lack the deeper understanding of the world needed to answer questions that go beyond information explicitly stated in text, and to explain and discuss those answers. To reach this next generation of AI applications, it is imperative to make faster progress in the areas of knowledge, modeling, reasoning, and language. Standardized tests have often been proposed as a driver for such progress, with good reason: many of the questions require sophisticated understanding of both language and the world, pushing the boundaries of AI, while other questions are easier, supporting incremental progress. In Project Aristo at the Allen Institute for AI, we are working on a specific version of this challenge, namely having the computer pass elementary school science and math exams. Even at this level there is a rich variety of problems and question types, the most difficult requiring significant progress in AI. Here we propose this task as a challenge problem for the community and provide supporting datasets. Solutions to many of these problems would have a major impact on the field, so we encourage you: Take the Aristo Challenge!