A linked open data architecture for the historical archives of the Getulio Vargas Foundation

This paper presents an architecture for maintaining historical archives based on Linked Open Data technologies and on an open-source, distributed development model and tools. The proposed architecture is being implemented for the archives of the Centro de Pesquisa e Documentação de História Contemporânea do Brasil (Center for Research and Documentation of Brazilian Contemporary History) of the Fundação Getulio Vargas (Getulio Vargas Foundation). We discuss the benefits of this initiative, suggest ways of implementing it, and describe the preliminary milestones already achieved. We also present some possibilities, both in progress and planned, for extending the accessibility and usefulness of the archival data using semantic web technologies, natural language processing, image analysis tools, and audio–textual alignment.
