Prescriptive analytics: a survey of emerging trends and technologies

Abstract

This paper surveys the state of the art and future directions of one of the most important emerging technologies within business analytics (BA), namely prescriptive analytics (PSA). BA focuses on data-driven decision-making and consists of three phases: descriptive, predictive, and prescriptive analytics. Descriptive and predictive analytics allow us to analyze past events and predict future events, respectively, but neither provides direct support for decision-making. PSA fills this gap between data and decisions. We have observed an increasing interest in in-DBMS PSA systems in both research and industry. This paper therefore aims to provide a foundation for PSA as a separate field of study. To do this, we first describe the different phases of BA. We then survey classical analytics systems and identify their main limitations for supporting PSA, based on which we introduce the criteria and methodology used in our analysis. We next survey, categorize, and discuss the state of the art within emerging, so-called PSA$^+$, systems, followed by a presentation of the main challenges and opportunities for next-generation PSA systems. Finally, the main findings are discussed and directions for future research are outlined.
