Semantic Document Networks to Support Concept Retrieval

Simon Boese
University of Hamburg, Germany

This chapter presents a framework for advanced document storage and fast, concept-focused document retrieval. Such retrieval relies on ‘semantic’ searches, which evaluate and use the meanings of words and phrases, rather than on ‘key-word’ searches. The framework rests on three stages: pre-processing (semantic analysis, which influences the storage quality within a semantic database), conceptualization (extraction of key concepts to establish document networks), and storage within a semantic database, which facilitates advanced future retrieval. The objective is to decompose documents and extract all relevant information about structure and content to allow comprehensive storage in a semantic document network, including the interpretation according to domains, contexts, languages, or readers.

For example, the word ‘trunk’ may refer to a storage area (in the context of motor vehicles), a clothes storage box (in the context of travelling), or an elephant’s appendage (in the context of a safari); see Figure 1. The arrows in the figure represent parameters associated with relations. The related words can themselves have multiple meanings, and only the clustering of words provides the context that gives readers the meaning; e.g., Safari is also the name of an Internet browser.

A brief introduction to conceptualization and the semantic document network provides an overview of how information can be stored in an interlinked network. Using a short sample, we demonstrate the calculation of the semantic core using concept-based indexing and how the concepts are embedded within the existing semantic document network.
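The ‘trunk’ example can be illustrated with a minimal sketch of context-based sense selection. This is not the chapter's method: it is a simplified overlap count, assuming a small hand-made sense inventory. The names `SENSES` and `disambiguate`, and the cue words attached to each sense, are hypothetical and chosen only for illustration.

```python
# Hypothetical sense inventory (an assumption for illustration only):
# each sense of "trunk" is paired with words that tend to cluster around it.
SENSES = {
    "trunk": {
        "vehicle storage area": {"car", "vehicle", "boot", "luggage", "road"},
        "clothes storage box": {"travel", "clothes", "packing", "journey", "box"},
        "elephant appendage": {"elephant", "safari", "animal", "water", "tusk"},
    }
}


def disambiguate(word, context, senses=SENSES):
    """Pick the sense whose cue words overlap most with the context.

    Returns the chosen sense and the overlap score of every sense.
    Ties are resolved in favour of the sense listed first.
    """
    # Normalise the context into a set of lower-cased, punctuation-free tokens.
    context_words = {w.lower().strip(".,;:!?") for w in context.split()}
    # Score each sense by the number of cue words found in the context.
    scores = {
        sense: len(cues & context_words)
        for sense, cues in senses[word].items()
    }
    best = max(scores, key=scores.get)
    return best, scores


sense, scores = disambiguate(
    "trunk", "On safari the elephant lifted its trunk to drink water"
)
print(sense)
print(scores)
```

The sketch selects a sense only from the words that surround the target word, which mirrors the point that the clustering of words supplies the context. A realistic system would draw its sense inventory from a lexical resource rather than from a hand-written dictionary.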
