Knowledge management and Cultural Heritage repositories: Cross-Lingual Information Retrieval strategies

In recent years, important initiatives such as the European Library and Europeana have aimed to increase the availability of cultural content from various types of providers and institutions. Access to these resources requires environments that can both manage multilingual complexity and preserve semantic interoperability. Natural Language Processing (NLP) applications are developed with the aim of achieving Cross-Lingual Information Retrieval (CLIR). This paper presents ongoing research on language processing based on the Lexicon-Grammar (LG) approach, with the goal of improving knowledge management in Cultural Heritage repositories. The proposed framework aims to guarantee interoperability between multilingual systems in order to overcome crucial issues such as cross-language and cross-collection retrieval. Indeed, the LG methodology tries to overcome the shortcomings of statistical approaches, such as those of Google Translate or Microsoft's Bing, in processing Multi-Word Units (MWUs) in queries, where the lack of linguistic context is a serious obstacle to disambiguation. In particular, translation within specific domains, as has been widely recognized, is unambiguous, since the meanings of terms are mono-referential and the relation that links a given term to its equivalent in a foreign language is biunivocal, i.e. a one-to-one coupling that makes the relation exclusive and reversible. Ontologies are used in CLIR, and several scholars consider them a promising research area for improving the effectiveness of Information Extraction (IE) techniques, particularly for technical-domain queries. Therefore, we present a methodological framework that makes it possible to map both the data and the metadata among the language-specific onto
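The biunivocal relation between domain terms and the role of MWUs in query translation can be illustrated with a minimal Python sketch. It is not the LG framework itself: the lexicon entries, the function names `invert` and `translate_query`, and the greedy longest-match strategy are all assumptions introduced here for illustration. The sketch shows that a one-to-one term lexicon can be reversed without loss, and that matching the longest multi-word entry first keeps a unit such as "natura morta" from being translated word by word.

```python
from typing import Dict, List

# Hypothetical Italian-to-English domain lexicon (illustrative entries only).
IT_EN: Dict[str, str] = {
    "natura morta": "still life",
    "pala d'altare": "altarpiece",
    "affresco": "fresco",
}


def invert(lexicon: Dict[str, str]) -> Dict[str, str]:
    """Reverse a lexicon, failing if the mapping is not one-to-one."""
    inverse: Dict[str, str] = {}
    for source, target in lexicon.items():
        if target in inverse:
            raise ValueError(f"not biunivocal: {target!r} has two sources")
        inverse[target] = source
    return inverse


def translate_query(query: str, lexicon: Dict[str, str], max_len: int = 4) -> List[str]:
    """Translate a query, preferring the longest multi-word entry at each position."""
    tokens = query.lower().split()
    result: List[str] = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first, then shrink it.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in lexicon:
                result.append(lexicon[candidate])
                i += n
                break
        else:
            # No entry covers this token: keep it untranslated.
            result.append(tokens[i])
            i += 1
    return result


if __name__ == "__main__":
    print(translate_query("natura morta affresco", IT_EN))
    print(translate_query("still life", invert(IT_EN)))
```

Because `invert` rejects any lexicon in which two source terms share a target, a successful inversion is a simple check that the mapping is exclusive and reversible in the sense described above.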
