Corpus-Based Knowledge Representation

A corpus-based knowledge representation system consists of a large collection of disparate knowledge fragments, or schemas, together with a rich set of statistics computed over that collection. We argue that, given such a corpus and the appropriate statistics, corpus-based representation offers an alternative to traditional knowledge representation for a broad class of applications. Its key advantage is that it avoids the laborious process of building an (often brittle) knowledge base. We describe the basic building blocks of a corpus-based representation system and a set of applications for which the paradigm is appropriate, including one application where the approach is already showing promising results.
