Biblio-transformation-engine: An open source framework and use cases in the digital libraries domain

In the course of developing digital libraries, repositories and archives, a constantly recurring requirement is the transformation of data between diverse formats. The need may arise in-house, be raised by end-user groups, or simply follow from technological evolution. Moreover, the data itself is more important than the code that handles it, and the code changes far more frequently than the data. This highlights the necessity for reusable software that focuses on data transformations. Stemming from this observation, the approach presented here aims at facilitating these frequently encountered transformation tasks.

The source data can be records in a legacy database, data conforming to deprecated formats, or data that simply satisfies internal ad hoc needs. The target of the transformation is usually a step towards opening the data: conforming to widely adopted standards and practices so that the data can be integrated with other sources. An important observation that motivated our work is that the overall data transformation task consists of components that can be engineered in a modular way, so that they are largely independent of each other and some of them can be reused to a high degree, even in very different real-life cases. Therefore, we have created the biblio transformation engine, an open source framework that enables fine-grained modularity and injection of functional elements in a way that allows heavy reuse and thus faster development of transformation tasks.

The document is structured as follows: Section 2 provides details on the biblio transformation engine, the tool developed and used to reduce the effort required by the transformation process. Section 3 illustrates several use cases in real-world applications, while Section 4 concludes the paper by gathering our most important observations and future plans.