Order Acceptance with Reinforcement Learning

Order Acceptance (OA) is one of the main functions in a business control framework. Basically, OA involves for each order a 0/1 (i.e., reject/accept) decision. Always accepting an order when capacity is available could prevent the system from accepting more attractive orders in the future. Another important aspect is the availability of information to the decision maker. We use a stochastic modeling approach based on Markov decision theory, together with learning methods from Artificial Intelligence, to deal with uncertainty and long-term decisions in OA. Reinforcement Learning (RL) is a fairly new approach that combines this modeling idea with a solution method. Here we report on RL solutions for some OA models.
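
To make the accept/reject trade-off concrete, the following is a minimal sketch of tabular Q-learning on a toy OA model. It is not one of the models reported on here: the capacity, horizon, order types, arrival probabilities, rewards, and all function names are assumptions chosen purely for illustration. Each arriving order uses one unit of capacity, and the state is assumed to be (arrival index, remaining capacity, order type).

```python
import random
from collections import defaultdict

# Hypothetical toy model (an assumption, not taken from the paper).
CAPACITY = 2   # capacity units available in one planning period
HORIZON = 6    # number of order arrivals in one planning period
ORDER_TYPES = {"low": (0.7, 1.0), "high": (0.3, 5.0)}  # type -> (probability, reward)
REJECT, ACCEPT = 0, 1


def draw_order(rng):
    """Sample the type of the next arriving order."""
    return "high" if rng.random() < ORDER_TYPES["high"][0] else "low"


def feasible_actions(cap):
    """An order can only be accepted while capacity remains."""
    return [REJECT, ACCEPT] if cap > 0 else [REJECT]


def train(episodes=50_000, alpha=0.05, epsilon=0.2, seed=0):
    """Undiscounted, finite-horizon tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = defaultdict(lambda: [0.0, 0.0])  # state -> [Q(reject), Q(accept)]
    for _ in range(episodes):
        cap = CAPACITY
        order = draw_order(rng)
        for t in range(HORIZON):
            state = (t, cap, order)
            actions = feasible_actions(cap)
            if rng.random() < epsilon:
                action = rng.choice(actions)                      # explore
            else:
                action = max(actions, key=lambda a: q[state][a])  # exploit
            reward = ORDER_TYPES[order][1] if action == ACCEPT else 0.0
            cap -= action              # accepting consumes one capacity unit
            order = draw_order(rng)    # next arrival
            if t + 1 < HORIZON:
                nxt = (t + 1, cap, order)
                target = reward + max(q[nxt][a] for a in feasible_actions(cap))
            else:
                target = reward        # period ends, no future value
            q[state][action] += alpha * (target - q[state][action])
    return q


def greedy_action(q, t, cap, order):
    """Decision implied by the learned Q-values for one state."""
    if cap == 0:
        return REJECT
    values = q[(t, cap, order)]
    return ACCEPT if values[ACCEPT] > values[REJECT] else REJECT


if __name__ == "__main__":
    q = train()
    for t in range(HORIZON):
        decision = greedy_action(q, t, CAPACITY, "low")
        print(t, "accept" if decision == ACCEPT else "reject")
```

Under these assumed numbers, the learned policy should turn down a low-reward order at the start of the period even though capacity is free, because a high-reward order is likely to arrive later, and should accept it near the end of the period when little future demand remains.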
