Intelligent Inventory Control via Ruminative Reinforcement Learning

Inventory management is a sequential decision problem that can be solved with reinforcement learning (RL). Although RL in its conventional form does not require domain knowledge, exploiting knowledge of the problem structure, which is usually available in inventory management, can improve both the learning quality and the learning speed of RL. Ruminative reinforcement learning (RRL) was recently introduced on this basis. RRL is motivated by how humans contemplate the consequences of their actions when trying to learn to make better decisions. This study further investigates open issues of RRL and proposes new RRL methods for inventory management. Our investigation provides insight into the characteristics of different RRL methods, and our experimental results show the viability of the new methods.
