Fundamentals of Business Intelligence

This book presents a comprehensive and systematic introduction to transforming process-oriented data into information about the underlying business process, which is essential for all kinds of decision-making. To that end, the authors develop step-by-step models and analytical tools for obtaining high-quality data structured in such a way that complex analytical tools can be applied. The main emphasis is on process mining and data mining techniques and on the combination of these methods for process-oriented data. After a general introduction to the business intelligence (BI) process and its constituent tasks in Chapter 1, Chapter 2 discusses different approaches to modeling in BI applications. Chapter 3 gives an overview of data provisioning and covers it in detail, including a section on big data. Chapter 4 tackles data description, visualization, and reporting. Chapter 5 introduces data mining techniques for cross-sectional data. Different techniques for the analysis of temporal data are then detailed in Chapter 6. Subsequently, Chapter 7 explains techniques for the analysis of process data, followed by the introduction of analysis techniques for multiple BI perspectives in Chapter 8. The book closes with a summary and discussion in Chapter 9. Throughout the book, (mostly open-source) tools are recommended, described, and applied; a more detailed survey of tools can be found in the appendix, and detailed code for the solutions, together with instructions on how to install the software used, can be found on the accompanying website. All concepts presented are illustrated, and selected examples and exercises are provided. The book is suitable for graduate students in computer science, and the dedicated website with examples and solutions makes it ideal as a textbook for a first course in business intelligence in computer science or business information systems.
Additionally, practitioners and industrial developers who are interested in the concepts behind business intelligence will benefit from the clear explanations and many examples.
