Stochastic Dynamic Production Control by Neurodynamic Programming

The paper proposes Markov decision processes (MDPs) for modeling production control systems that operate in uncertain and changing environments. In an MDP, finding an optimal control policy reduces to computing the optimal value function, which is the unique solution of the Bellman equation. Reinforcement learning methods, such as Q-learning, can be used to estimate this function; however, value estimates are often available only for a few states of the environment, typically generated by simulation. The paper suggests applying a new type of support vector regression model, called ν-SVR, which can effectively fit a smooth function to the available data and offers good generalization. The effectiveness of the approach is demonstrated by experimental results on both benchmark and industry-related data.
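To make the abstract's central claim concrete — that the optimal value function is the unique fixed point of the Bellman equation — here is a minimal sketch of value iteration on a hypothetical two-state MDP. The states, actions, transition probabilities, and rewards are invented for illustration and are not taken from the paper; in the paper's setting, such exact dynamic programming is infeasible and only simulated value estimates at a few states are available, which is what motivates fitting a ν-SVR regressor over them.

```python
GAMMA = 0.9  # discount factor (illustrative choice)

# Hypothetical MDP: P[s][a] = list of (probability, next_state, reward)
P = {
    0: {"stay": [(1.0, 0, 0.0)],
        "go":   [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
    1: {"stay": [(1.0, 1, 2.0)],
        "go":   [(1.0, 0, 0.5)]},
}

def value_iteration(P, gamma=GAMMA, tol=1e-9):
    """Iterate the Bellman optimality operator to its fixed point:
    V(s) = max_a sum_{s'} p(s'|s,a) * (r(s,a,s') + gamma * V(s'))."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            v_new = max(
                sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])
                for a in P[s]
            )
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < tol:  # contraction guarantees convergence
            return V

V = value_iteration(P)
# Greedy policy with respect to the converged value function
policy = {
    s: max(P[s], key=lambda a: sum(p * (r + GAMMA * V[s2])
                                   for p, s2, r in P[s][a]))
    for s in P
}
```

In realistic production control problems the state space is far too large to enumerate like this; Q-learning produces noisy value estimates at simulated states, and a smooth regressor such as ν-SVR generalizes them to unvisited states.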
