Proceedings of the NAACL HLT 2010 First International Workshop on Formalisms and Methodology for Learning by Reading

This paper explores the close relationship between question answering and machine reading, and how the active use of reasoning to answer (and, in the process, disambiguate) questions can also be applied to reading declarative texts when a substantial proportion of the text’s content is already known to (represented in) the system. In question answering, a question may be ambiguous, and the "right" way to disambiguate it may become apparent only in the process of trying to answer it. Similarly, in machine reading, a text may be ambiguous and may require some process to relate it to what is already known. Our conjecture is that these two processes are similar, and that a question answering tool can be modified to help "read" new text that augments existing system knowledge. Specifically, interpreting a new text T can be recast as trying to answer, or partially answer, the question "Is it true that T?", which both disambiguates T appropriately and connects it to existing knowledge. Preliminary investigation suggests this approach might be useful for proposing knowledge base extensions, extracted from text, to a knowledge engineer.
