Reinforcement Learning in Markovian Evolutionary Games

In accordance with the disclosure, a high-voltage, low-current electric signal is applied across the electrode and workpiece in an electro-erosion machining circuit to improve machining. The high-voltage, low-current signal may be continuous or pulsed. If pulsed, it may be triggered by the intermittent low-voltage, high-current electro-erosion machining pulses or generated independently of them; it may begin before, at the same time as, or after those pulses; it may have the same or the opposite polarity; and its frequency and pulse width may be varied. A capacitor may also be selectively connected across the electrode and workpiece during machining, in conjunction with the high-voltage, low-current signal, to further improve the machining characteristics of the low-voltage, high-current electro-erosion machining signal.
