A Markov-Based Update Policy for Constantly Changing Database Systems

To maximize the value of an organization's data assets, the data in its databases must be kept up to date. In the era of big data, however, constantly changing data sources make it challenging to ensure data timeliness in enterprise systems. For instance, because purchase transactions occur at high frequency, purchase data stored in an enterprise resource planning system can quickly become outdated, which degrades the accuracy of inventory data and the quality of inventory replenishment decisions. Despite the importance of data timeliness, updating a database as soon as new data arrives is typically not optimal because each update incurs a high cost. A critical problem in this context is therefore to determine the optimal update policy for database systems. In this study, we develop a Markov decision process model, solved via dynamic programming, to derive the optimal update policy that minimizes the sum of data staleness cost and update cost. Using real-world enterprise data, we conduct experiments to evaluate the proposed update policy against benchmark policies analyzed in the prior literature. The experimental results show that the proposed update policy outperforms fixed-interval update policies and can lead to significant cost savings.
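
To make the trade-off between staleness cost and update cost concrete, the following is a minimal sketch of a Markov decision process solved by value iteration, one standard dynamic programming method. It is not the model developed in this study: the state (number of pending changes), the Bernoulli arrival process, the linear staleness cost, the fixed update cost, the discount factor, and every name and parameter value below are assumptions chosen for illustration only.

```python
def solve_update_policy(max_pending=20, arrival_prob=0.6, staleness_cost=1.0,
                        update_cost=8.0, discount=0.95, tol=1e-9):
    """Value iteration for a toy update-timing MDP (all modeling choices assumed).

    State s: number of changes that have arrived but are not yet applied.
    Action "wait": pay staleness_cost * s for this period.
    Action "update": pay update_cost and clear the backlog.
    After either action, one new change arrives with probability arrival_prob.
    """
    values = [0.0] * (max_pending + 1)

    def expected_next(backlog):
        # Expected future cost after the (possible) arrival of one new change;
        # the backlog is capped at max_pending to keep the state space finite.
        grown = min(backlog + 1, max_pending)
        return arrival_prob * values[grown] + (1.0 - arrival_prob) * values[backlog]

    while True:
        new_values, policy = [], []
        for s in range(max_pending + 1):
            wait = staleness_cost * s + discount * expected_next(s)
            update = update_cost + discount * expected_next(0)
            policy.append("update" if update < wait else "wait")
            new_values.append(min(wait, update))
        gap = max(abs(a - b) for a, b in zip(new_values, values))
        values = new_values
        if gap < tol:
            return values, policy


values, policy = solve_update_policy()
threshold = policy.index("update")  # smallest backlog at which updating is cheaper
print(f"update once at least {threshold} changes are pending")
```

Under these assumptions the resulting policy is state-dependent: it waits while the backlog is small and updates once the backlog reaches a threshold, in contrast to a fixed-interval policy that updates on a schedule regardless of how many changes have accumulated.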
