Reinforcement learning based resource allocation in business process management

Efficient resource allocation is a complex and dynamic task in business process management. Although a wide variety of mechanisms have emerged to support resource allocation in business process execution, these approaches do not consider performance optimization. This paper introduces a mechanism in which the resource allocation optimization problem is modeled as a Markov decision process and solved using reinforcement learning. The proposed mechanism observes its environment to learn policies that optimize resource allocation in business process execution. The experimental results indicate that the proposed approach outperforms well-known heuristic and hand-coded strategies, and may improve the current state of business process management.
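The core idea of modeling resource allocation as a Markov decision process and learning an assignment policy can be illustrated with a minimal tabular Q-learning sketch. This is not the paper's implementation: the task types, resources, processing times, and hyperparameters below are all illustrative assumptions. State here is just the incoming task type, an action assigns the task to one of two resources, and the reward is the negative processing time, so the learned policy minimizes expected completion time.

```python
import random

# Hypothetical processing times: PROCESSING_TIME[task_type][resource].
# These figures are purely illustrative, not drawn from the paper.
PROCESSING_TIME = {
    "review":  [4.0, 2.0],   # resource 1 is faster at reviews
    "approve": [1.0, 3.0],   # resource 0 is faster at approvals
}

def q_learning(episodes=2000, alpha=0.1, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    # Q-table over (state, action) pairs, initialized to zero.
    q = {(s, a): 0.0 for s in PROCESSING_TIME for a in (0, 1)}
    for _ in range(episodes):
        state = rng.choice(list(PROCESSING_TIME))
        # Epsilon-greedy action selection: explore occasionally,
        # otherwise pick the action with the highest Q-value.
        if rng.random() < epsilon:
            action = rng.randrange(2)
        else:
            action = max((0, 1), key=lambda a: q[(state, a)])
        reward = -PROCESSING_TIME[state][action]
        # One-step Q-update; each episode is a single assignment,
        # so there is no discounted next-state term (gamma = 0).
        q[(state, action)] += alpha * (reward - q[(state, action)])
    # Greedy policy: best resource for each task type.
    return {s: max((0, 1), key=lambda a: q[(s, a)]) for s in PROCESSING_TIME}

policy = q_learning()
print(policy)  # {'review': 1, 'approve': 0}
```

In a realistic business process setting the state would also encode queue lengths and resource availability, and a nonzero discount factor would propagate value across successive assignments; this sketch only shows the Q-update mechanics.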
