Search and Analytics Challenges in Digital Libraries and Archives

Public institutions, such as universities, maintain data in several information silos, each engineered to serve a specific vertical application. Data about key entities (such as people, publications, courses, and projects) is scattered across these silos and is difficult to correlate because of the diversity of formats, metadata, conventions, and terminology. In such a scenario it is practically impossible to correlate data and to support advanced search and analytics facilities. These facilities are vital for identifying institutional priorities and supporting institutional strategic goals, as well as for offering effective data visualization and navigation services to users (e.g., researchers, students, alumni, companies).

A catalogue, in libraries and archives, is a collection of organized data describing the information content managed by an institution [Patton 2009]. Cataloguing is the process, guided by rigorous rules, that information scientists follow to create and maintain metadata in order to represent and exploit information content effectively. The most widespread library data models are still traditional record-based models, i.e., models that bundle information about the same entity into a single record. The advent of the Web opened boundless opportunities to information seekers, especially in terms of quantity of information and abundance of search tools. This has brought libraries and their cataloguing practices to a crisis point [Coyle and Hillmann 2007]. Users' heightened expectations have led libraries to embrace the Semantic Web vision [Berners-Lee et al. 2001], which holds that representing data in a uniform machine-readable format with explicit meaning allows the development of intelligent, interconnected services that can retrieve and aggregate data from different sources.
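As a minimal sketch of this idea, the Python fragment below takes two records from separate silos, rewrites them as uniform statements of the form (subject, property, object), and aggregates them so that a single query can span both sources. Everything in it is an assumption made for illustration: the two records, the field names, the property names, the `http://example.org/` namespace, and the helper functions are hypothetical and do not come from any particular library system or vocabulary.

```python
# Two hypothetical silos that describe the same person with different conventions.
hr_record = {"staff_id": "p42", "full_name": "Ada Rossi", "dept": "Computer Science"}
library_record = {"author_key": "p42", "title": "Linked Data in Practice", "year": 2014}

BASE = "http://example.org/"  # placeholder namespace, not a real service


def hr_to_statements(record):
    """Rewrite an HR record as (subject, property, object) statements."""
    person = BASE + "person/" + record["staff_id"]
    return {
        (person, "name", record["full_name"]),
        (person, "memberOf", record["dept"]),
    }


def library_to_statements(record, pub_id):
    """Rewrite a library record as statements that point to the same person."""
    publication = BASE + "publication/" + pub_id
    person = BASE + "person/" + record["author_key"]
    return {
        (publication, "title", record["title"]),
        (publication, "year", record["year"]),
        (publication, "author", person),
    }


def aggregate(*sources):
    """Merge statement sets from any number of sources into one set."""
    merged = set()
    for statements in sources:
        merged |= statements
    return merged


def publications_by_department(statements, dept):
    """Titles of publications whose author is a member of the given department."""
    members = {s for (s, p, o) in statements if p == "memberOf" and o == dept}
    pubs = {s for (s, p, o) in statements if p == "author" and o in members}
    return sorted(o for (s, p, o) in statements if p == "title" and s in pubs)


graph = aggregate(
    hr_to_statements(hr_record),
    library_to_statements(library_record, "pub1"),
)
print(publications_by_department(graph, "Computer Science"))
```

The query at the end needs facts from both silos (department membership from one, authorship from the other), which neither record could answer on its own. The sketch assumes that both silos already share a person identifier; in practice, reconciling identifiers across sources is a separate and harder step.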
Libraries have started adopting the Linked Data approach, which in turn is leading to a paradigm shift from record-based to entity-based models, i.e., models in which relevant entities are assigned URIs and are described in terms of subject-property-object triples. Taken together, the triples form a knowledge graph. The extent to which this is happening is described in Alemu et al. [2012] and Martin and Mundle [2014]. Active institutions

[1] James A. Hendler, et al. "The Semantic Web" in Scientific American, 2001.

[2] George Ghinea, et al. Organisational challenges of the semantic web in digital libraries: a Norwegian case study, 2009, Online Inf. Rev.

[3] Joshua Barton, et al. Old Hopes, New Possibilities: Next-Generation Catalogues and the Centralization of Access, 2012, Libr. Trends.

[4] Glenn E. Patton. Functional Requirements for Authority Data: A Conceptual Model, 2009.

[5] Huajun Chen, et al. The Semantic Web, 2011, Lecture Notes in Computer Science.

[6] Thomas J. Steenburgh, et al. Motivating Salespeople: What Really Works, 2012, Harvard business review.

[7] Glenn E. Patton. Members of the IFLA Working Group on Functional Requirements and Numbering of Authority Records, 2009.

[8] Karen Coyle, et al. Resource Description and Access (RDA): Cataloging Rules for the 20th Century, 2007.

[9] Fausto Giunchiglia, et al. From Knowledge Organization to Knowledge Representation, 2014.

[10] T. Davenport, et al. Data scientist: the sexiest job of the 21st century, 2012, Harvard business review.

[11] Getaneh Alemu, et al. Linked data for libraries: benefits of a conceptual shift from library-specific record structures to RDF-based data models, 2012.

[12] Mirosław Kutyłowski, et al. ICT Systems Security and Privacy Protection, 2018, IFIP Advances in Information and Communication Technology.

[13] Michael Teets, et al. Libraries' Role in Curating and Exposing Big Data, 2013, Future Internet.

[14] Kristin E. Martin, et al. Positioning Libraries for a New Bibliographic Universe, 2014.

[15] Lisa Goddard, et al. The Strongest Link: Libraries and Linked Data, 2010, D Lib Mag.

[16] Jaap-Henk Hoepman, et al. PDF hosted at the Radboud Repository of the Radboud University Nijmegen, 2022.