Business information modeling: A methodology for data-intensive projects, data science and big data governance

This paper discusses an integrated methodology for structuring and formalizing business requirements in large data-intensive projects, e.g. data warehouse implementations, turning them into precise and unambiguous data definitions that facilitate harmonization and the assignment of data governance responsibilities. We place a business information model at the center, used end-to-end from analysis, design, development, and testing through to data quality checks by data stewards. In addition, we show that the approach is suitable beyond traditional data warehouse environments by applying it to big data landscapes and data science initiatives, where business requirements analysis is often neglected. As proper tool support has turned out to be indispensable in many real-world settings, we also discuss software requirements and their implementation in the Accurity Glossary tool. The approach is evaluated on a large banking data warehouse project in which the authors are currently involved.
