Commonsense Properties from Query Logs and Question Answering Forums

Commonsense knowledge about object properties, human behavior and general concepts is crucial for robust AI applications. However, automatic acquisition of this knowledge is challenging because of sparseness and bias in online sources. This paper presents Quasimodo, a methodology and tool suite for distilling commonsense properties from non-standard web sources. We devise novel ways of tapping into search-engine query logs and QA forums, and combining the resulting candidate assertions with statistical cues from encyclopedias, books and image tags in a corroboration step. Unlike prior work on commonsense knowledge bases, Quasimodo focuses on salient properties that are typically associated with certain objects or concepts. Extensive evaluations, including extrinsic use-case studies, show that Quasimodo provides better coverage than state-of-the-art baselines with comparable quality.
