Conceptual Modeling Foundations for a Web of Knowledge

The semantic web purports to be a web of knowledge that can answer our questions, help us reason about everyday problems as well as scientific endeavors, and service many of our wants and needs. Researchers and others expound various views about exactly what this means. Here we propose an answer with conceptual modeling as its foundation. We define a web of knowledge as a collection of interconnected knowledge bundles superimposed over a web of documents. Knowledge bundles are conceptual model instances augmented with facilities that provide for both extensional and intensional facts, for linking between knowledge bundles yielding a web of data, and for linking to an underlying document collection providing a means of authentication. We formally define both the component parts of these augmented conceptual models and their synergistic interconnections. As for practicalities, we discuss problems regarding the potentially high cost of constructing a web of knowledge and explain how they may be mitigated. We also discuss usage issues and show how untrained users can interact with and gain benefit from a web of knowledge.
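
As a rough illustration only, and not the formal definition the paper develops, a knowledge bundle can be pictured as a small data structure. It holds extensional facts (stored), intensional facts (derived by rules), links to other bundles, and a pointer from each stored fact to the document that supports it. Every name in this Python sketch (`Fact`, `KnowledgeBundle`, `grandparent_rule`, the example URLs and identifiers) is a hypothetical stand-in chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Fact:
    """One ground fact; `source` is an assumed pointer into a web document."""
    predicate: str
    args: tuple
    source: Optional[str] = None  # None for facts that are derived, not stored


@dataclass
class KnowledgeBundle:
    """Hypothetical sketch of a knowledge bundle (not the paper's formalism)."""
    name: str
    facts: set = field(default_factory=set)    # extensional facts
    rules: list = field(default_factory=list)  # intensional: functions that derive facts
    links: dict = field(default_factory=dict)  # local object -> (other bundle, object id)

    def all_facts(self):
        """Stored facts plus one pass of rule application (no fixpoint iteration)."""
        result = set(self.facts)
        for rule in self.rules:
            result |= set(rule(self.facts))
        return result


def grandparent_rule(facts):
    """Example intensional rule: derive grandparent_of from parent_of."""
    parents = [f.args for f in facts if f.predicate == "parent_of"]
    return {
        Fact("grandparent_of", (a, c))
        for a, b in parents
        for b2, c in parents
        if b == b2
    }


kb = KnowledgeBundle("family")
kb.facts.add(Fact("parent_of", ("Ann", "Bob"), source="http://example.org/doc1#p3"))
kb.facts.add(Fact("parent_of", ("Bob", "Cy"), source="http://example.org/doc2#p1"))
kb.rules.append(grandparent_rule)
# Link a local object to an object in another (hypothetical) bundle.
kb.links["Ann"] = ("census", "person/17")
```

In this sketch, the `source` field stands in for the link to the underlying document collection that lets a user check a stored fact, and `links` stands in for the connections between bundles that yield a web of data.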
