Technology for local textual inference is central to producing a next generation of intelligent yet robust human language processing systems. One can think of it as Information Retrieval++. It is needed for a search on "male fertility may be affected by use of cell phones" to match a document saying "Startling new research into mobile phones suggests they can reduce a man’s sperm count up to 30%", despite the fact that the only word overlap is "phones". But textual inference is useful more broadly. It is an enabling technology for applications of document interpretation, such as customer response management, where one would like to conclude from the message "My Squeezebox regularly skips during music playback" that "Sender has set up Squeezebox" and "Sender can hear music through Squeezebox", and information extraction, where from the text "Jorma Ollila joined Nokia in 1985 and held a variety of key management positions before taking the helm in 1992", one wants to extract that "Jorma Ollila has served as the CEO of Nokia", a relation that might be more formally denoted as role(CEO, Nokia, Jorma Ollila). Textual inference is a difficult problem (as the results from early evaluations have shown): current systems do statistically better than random guessing, but not by very much. Nevertheless, it is also an area where there is promising developing technology and a good deal of natural language community interest. In other words, it is an ideal research problem. To further this research agenda, data sets have been constructed to assess textual inference systems. This paper examines how the task of textual inference has been and should be defined and discusses what kind of evaluation data is appropriate for the task.