Nash-reinforcement learning (N-RL) for developing coordination strategies in non-transferable utility games

Social (central) planning is commonly used in the literature to optimize the system-wide efficiency and utility of multi-operator systems. Central planning tries to maximize the system's benefits by coordinating the operators' strategies and reducing externalities, assuming that all parties are willing to cooperate. This assumption implies that operators are willing to base their decisions on group rationality rather than individual rationality, even if increased group benefits result in reduced benefits for some agents. This limits the applicability of the social planner's solutions, as perfect cooperation among agents is often infeasible in the real world. Recognizing that decisions in human systems are normally based on individual rationality, cooperative game theory methods are commonly employed to address this major limitation of social planner methods. Game theory methods revise the social planner's solution so that group benefits increase and no agent's cooperative gain is less than its non-cooperative gain. However, in most cases utility is assumed to be transferable, and the literature has not sufficiently focused on non-transferable utility games. In such games, parties are willing to cooperate and coordinate their strategies to increase their benefits, but have no ability to compensate each other to promote cooperation. To a large extent, the transferable utility assumption stems from the computational complexity of finding agents' best response strategies in non-cooperative and cooperative modes, especially in multi-period games. By combining reinforcement learning with the Nash bargaining solution, this paper develops a new method for applying cooperative game theory to complex multi-period non-transferable utility games.
For illustration, the suggested method is applied to two numerical examples in which two hydropower operators seek to develop a fair and efficient cooperation mechanism to increase their gains.
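
As a rough illustration of the bargaining step only, the sketch below picks a Nash bargaining outcome from a small, finite set of joint strategies for two agents without side payments. The payoff table, the disagreement (non-cooperative) payoffs, and the function name `nash_bargaining` are all hypothetical and are not taken from the paper; in the paper's setting these payoffs would presumably be estimated by reinforcement learning, which this sketch does not attempt.

```python
from typing import Dict, Hashable, Optional, Tuple

Payoff = Tuple[float, float]


def nash_bargaining(payoffs: Dict[Hashable, Payoff],
                    disagreement: Payoff) -> Optional[Hashable]:
    """Return the joint strategy maximizing the Nash product
    (u1 - d1) * (u2 - d2) among individually rational outcomes.

    Returns None if no outcome gives both agents at least their
    disagreement (non-cooperative) payoff.
    """
    d1, d2 = disagreement
    best, best_product = None, -1.0
    for strategy, (u1, u2) in payoffs.items():
        # Individual rationality: neither agent may fall below the
        # payoff it would obtain without cooperating.
        if u1 < d1 or u2 < d2:
            continue
        product = (u1 - d1) * (u2 - d2)
        if product > best_product:
            best, best_product = strategy, product
    return best


# Hypothetical payoffs (agent 1, agent 2) for four joint strategies.
PAYOFFS = {"A": (10.0, 2.0), "B": (7.0, 6.0), "C": (4.0, 8.0), "D": (14.0, 0.5)}
# Hypothetical non-cooperative payoffs used as the disagreement point.
DISAGREEMENT = (3.0, 1.0)

if __name__ == "__main__":
    # A social planner maximizing the sum of payoffs ignores
    # individual rationality and may leave one agent worse off.
    planner_choice = max(PAYOFFS, key=lambda s: sum(PAYOFFS[s]))
    print("social planner:", planner_choice)
    print("Nash bargaining:", nash_bargaining(PAYOFFS, DISAGREEMENT))
```

With these made-up numbers, the sum-maximizing outcome gives agent 2 less than its non-cooperative payoff and cannot be repaired by a side payment, whereas the bargaining outcome keeps both agents above their disagreement payoffs.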
