SODA: A Service Oriented Data Acquisition Framework

In this chapter, the authors present Service Oriented Data Acquisition (SODA), a service-deployable open-source platform for retrieving and dynamically aggregating information extraction and knowledge acquisition software components. The motivation for creating such a system came from the observed gap between the wide availability of information analysis components for different frameworks (such as UIMA [Ferrucci & Lally, 2004] and GATE [Cunningham, Maynard, Bontcheva, & Tablan, 2002]) and the difficulty of discovering, retrieving, and integrating these components and embedding them into software systems for knowledge feeding. Analyzing the research area, the authors found that a few solutions to this problem exist, but none ensures a high level of platform independence, collaboration, flexibility, and, most of all, openness. The proposed solution targets different kinds of users, from application developers, who benefit from a semantic repository of interconnectable information extraction and ontology feeding components, to end users, who can plug and play these components through SODA-compliant clients.

DOI: 10.4018/978-1-4666-0188-8.ch003

[1]  Asunción Gómez-Pérez,et al.  Localizing Ontologies in OWL , 2007 .

[2]  Tim Berners-Lee,et al.  Linked Data - The Story So Far , 2009, Int. J. Semantic Web Inf. Syst..

[3]  Henrik Eriksson,et al.  The evolution of Protégé: an environment for knowledge-based systems development , 2003, Int. J. Hum. Comput. Stud..

[4]  René Witte,et al.  Semantic Assistants - User-Centric Natural Language Processing Services for Desktop Clients , 2008, ASWC.

[5]  Roberto Basili,et al.  Integrating ontological and linguistic knowledge for conceptual information extraction , 2003, Proceedings IEEE/WIC International Conference on Web Intelligence (WI 2003).

[6]  Malcolm James Beynon,et al.  Re-Sampling Based Data Mining Using Rough Set Theory , 2007 .

[7]  Hamish Cunningham,et al.  GATE-a General Architecture for Text Engineering , 1996, COLING.

[8]  Maria Teresa Pazienza,et al.  Linguistic Watermark 3.0: An RDF Framework and a Software Library for Bridging Language and Ontologies in the Semantic Web , 2008, SWAP.

[9]  Tim Beardsley.  Tool Time on Cactus Hill , 1998 .

[10]  Paul Buitelaar,et al.  LingInfo: Design and Applications of a Model for the Integration of Linguistic Information in Ontologies , 2006 .

[11]  Hussein A. Abbass,et al.  Heuristics and optimization for knowledge discovery , 2002 .

[12]  Aurora Pérez,et al.  Cooperation Between Expert Knowledge and Data Mining Discovered Knowledge , 2011 .

[13]  Johanna Völker,et al.  A Framework for Ontology Learning and Data-driven Change Discovery , 2005 .

[14]  Farid Cerbah,et al.  A Service Oriented Architecture for Adaptable Terminology Acquisition , 2007, NLDB.

[15]  Maria Teresa Pazienza,et al.  A Suite of Semantic Web Tools Supporting Development of Multilingual Ontologies , 2010, Intelligent Information Access.

[16]  David A. Ferrucci,et al.  UIMA: an architectural approach to unstructured information processing in the corporate research environment , 2004, Natural Language Engineering.

[17]  Maria Teresa Pazienza,et al.  Computer-aided Ontology Development: an integrated environment , 2010 .

[18]  Donna K. Harman,et al.  The DARPA TIPSTER project , 1992, SIGF.

[19]  A. V. Senthil Kumar,et al.  Knowledge Discovery Practices and Emerging Applications of Data Mining: Trends and New Domains , 2010 .

[20]  Paul Buitelaar,et al.  A Protégé Plug-In for Ontology Extraction from Text Based on Linguistic Analysis , 2004, ESWS.

[21]  Maria Teresa Pazienza,et al.  Semantic Turkey : A Semantic Bookmarking Tool (System Description) , 2007, ESWC.

[22]  Mark D. Wilkinson,et al.  SADI Semantic Web Services - 'cause you can't always GET what you want! , 2009, 2009 IEEE Asia-Pacific Services Computing Conference (APSCC).

[23]  Maria Teresa Pazienza,et al.  Din din! The (Semantic) Turkey is served! , 2008, SWAP.

[24]  Mark A. Musen,et al.  The evolution of Protégé , 2003 .

[25]  Peter D. Karp,et al.  OKBC: A Programmatic Foundation for Knowledge Base Interoperability , 1998, AAAI/IAAI.

[26]  Paolo Bouquet,et al.  An Entity Name System (ENS) for the Semantic Web , 2008, ESWC.

[27]  Christian Chiarcos,et al.  Ontology-Based Interface Specifications for a NLP Pipeline Architecture , 2008, LREC.

[28]  Adil M. Bagirov,et al.  A Heuristic Algorithm for Feature Selection Based on Optimization Techniques , 2002 .

[29]  Bob Carpenter,et al.  The logic of typed feature structures , 1992 .

[30]  Paul Buitelaar,et al.  LexOnto: A Model for Ontology Lexicons for Ontology-based NLP , 2007 .

[31]  James A. Hendler,et al.  The Semantic Web: A new form of Web content that is meaningful to computers will unleash a revolution of new possibilities , 2001 .

[32]  Bob Carpenter.  The Logic of Typed Feature Structures (Cambridge Tracts in Theoretical Computer Science) , 2005 .