On Warehouses, Lakes, and Spaces: The Changing Role of Conceptual Modeling for Data Integration

The role of conceptual models, their formalization and implementation as knowledge bases, and the related metadata and metamodel management has continuously evolved since their inception in the late 1970s. In this paper, we trace this evolution from traditional database design, through data warehouse integration, to recent data lake architectures. Concerning future developments, we argue that research has perhaps focused too heavily on the design perspective of individual companies or of strongly managed, centralized company networks, culminating in today’s huge oligopolistic web players. We propose a vision of interacting data spaces, which seems to offer small and medium enterprises more sovereignty over their own data.
