Adaptive Critic Design of Control Policies for A Multi-Echelon Inventory System

An abstract of the dissertation of Stephen Shervais for the Doctor of Philosophy in Systems Science, presented October 6, 2000.

Title: Adaptive Critic Design of Control Policies For A Multi-Echelon Inventory System

A common problem in business is the determination of inventory and transportation policies for a physical distribution system within a changing business environment. This dissertation addresses the process of selecting an optimal set of policies for a multi-product, multi-echelon, multi-modal physical distribution system in a nonstationary environment. The problem is highly multi-dimensional, even for a small system, and the fitness surface is quite often discontinuous, with low-penalty and high-penalty regions separated by no more than a single transport unit.

A controller design process is presented that reliably improves on the performance of typical fixed-policy controllers. The design process has two basic stages. First, a Genetic Algorithm (GA) performs a global search (in a static environment) to find a good initial policy, which serves as the starting point for the next stage. Second, an approximate dynamic programming method, implemented by an adaptive critic method known as Dual Heuristic Programming (DHP), performs local optimization and fitness-terrain following in a changing environment, to yield (approximately) optimal control policies. The design process includes the use of training data embodying 1/f noise.

The performance of the resulting controllers for the defined context was compared with that of fixed-policy controllers: in this case, a fixed policy developed using the well-known Linear Programming (LP) method and, in addition, a fixed policy developed via a GA approach. Significantly, even the worst controller developed via the proposed method substantially outperformed both the LP and GA fixed policies. On a real-world generalization test set, the average total cost score for neural-adaptive controllers was 464, while fixed-policy controllers based on LP optimization scored costs of 2891, and GA-based fixed-policy controllers scored 3477.

In addition, I demonstrate the effectiveness of off-optimal, GA-developed I/O pairs as the basis for training a neural-network model of the plant (needed for the DHP process). Further, I speculate on the use of a GA as a way of testing candidate business rules for structural errors.
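
The first design stage, a GA global search over a discontinuous cost surface in a static environment, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the dissertation: the single-item, single-location (s, S) policy, the cost coefficients, the zero lead time, and the names `policy_cost` and `genetic_search` are all hypothetical. The truck-count term is included only to show how one extra unit of demand can move a policy from a low-cost to a high-cost region.

```python
import random


def policy_cost(policy, demand, truck_capacity=10):
    """Toy cost of an (s, S) policy at one stock point (illustrative only)."""
    reorder_point, order_up_to = policy
    stock = order_up_to
    cost = 0.0
    for d in demand:
        stock -= d
        if stock < 0:
            cost += 20.0 * (-stock)  # assumed stockout penalty per unit short
            stock = 0
        if stock <= reorder_point:
            qty = order_up_to - stock
            # Ceiling division: one extra unit can require one extra truck,
            # which makes the cost surface discontinuous.
            trucks = -(-qty // truck_capacity)
            cost += 15.0 * trucks    # assumed cost per truck
            stock = order_up_to      # assumed zero lead time
        cost += 1.0 * stock          # assumed holding cost per unit per period
    return cost


def genetic_search(cost_fn, pop_size=30, generations=40, seed=0):
    """Small elitist GA over (reorder_point, order_up_to) integer pairs."""
    rng = random.Random(seed)

    def random_policy():
        s = rng.randint(0, 30)
        return (s, s + rng.randint(1, 30))

    def crossover(a, b):
        # Reorder point from parent a, order-up-to gap from parent b.
        return (a[0], a[0] + (b[1] - b[0]))

    def mutate(p):
        s = min(30, max(0, p[0] + rng.randint(-2, 2)))
        gap = min(30, max(1, (p[1] - p[0]) + rng.randint(-2, 2)))
        return (s, s + gap)

    pop = [random_policy() for _ in range(pop_size)]
    history = []  # best cost found in each generation
    for _ in range(generations):
        pop.sort(key=cost_fn)
        history.append(cost_fn(pop[0]))
        elite = pop[: pop_size // 3]  # elites survive unchanged
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite)))
            for _ in range(pop_size - len(elite))
        ]
        pop = elite + children
    return min(pop, key=cost_fn), history


if __name__ == "__main__":
    demand_rng = random.Random(1)
    demand = [demand_rng.randint(0, 8) for _ in range(60)]  # static demand
    best, history = genetic_search(lambda p: policy_cost(p, demand))
    print("best (s, S) policy:", best, "cost:", policy_cost(best, demand))
```

In the two-stage process the abstract describes, a policy found this way would only be the starting point; the second stage (DHP adaptation in a changing environment) is not sketched here.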
