How can ignorant but patient cognitive terminals learn their strategy and utility?

This paper aims to help bridge the gap between existing theoretical results on distributed radio resource allocation policies based on game-theoretic equilibria (which assume complete information and rational players) and the practical design of signal processing algorithms for self-configuring wireless networks. For this purpose, the framework of learning theory in games is exploited. A new learning algorithm that relies only on mild information assumptions at the transmitters is presented. This algorithm possesses attractive convergence properties not available for standard reinforcement learning algorithms and, in addition, allows each transmitter to learn both its optimal strategy and the values of its expected utility for all its actions. A detailed convergence analysis is conducted. In particular, a framework is provided for studying heterogeneous wireless networks in which transmitters do not learn at the same rate. The proposed algorithm, which can be applied to any wireless network satisfying the stated information assumptions, is applied to multiple access channels in order to provide numerical results.
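
The idea of learning a strategy and per-action utility estimates at the same time can be illustrated with a minimal single-transmitter sketch. It is not the algorithm of the paper, whose update rules are not given in this abstract. The softmax (Boltzmann-Gibbs) target, the two step-size schedules, the temperature, the Gaussian noise model, and all names (`softmax`, `learn`, `mean_utilities`) are assumptions chosen for illustration. The transmitter observes only a noisy utility for the action it just played, updates that action's estimate on a fast time scale, and moves its mixed strategy on a slower one.

```python
import math
import random


def softmax(values, temperature):
    """Boltzmann-Gibbs distribution over actions for the given utility estimates."""
    peak = max(values)  # subtract the maximum for numerical stability
    weights = [math.exp((v - peak) / temperature) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def learn(mean_utilities, steps=20000, temperature=0.2, noise_std=0.1, seed=0):
    """Jointly learn a mixed strategy and per-action utility estimates.

    mean_utilities is the (hypothetical) true expected utility of each action;
    the learner never sees it directly, only noisy samples for the played action.
    """
    rng = random.Random(seed)
    n = len(mean_utilities)
    strategy = [1.0 / n] * n   # start from the uniform mixed strategy
    estimates = [0.0] * n      # estimated expected utility of each action
    counts = [0] * n           # number of times each action was played

    for t in range(1, steps + 1):
        # Play an action drawn from the current mixed strategy.
        action = rng.choices(range(n), weights=strategy)[0]
        observed = mean_utilities[action] + rng.gauss(0.0, noise_std)

        # Fast time scale: update only the estimate of the played action.
        counts[action] += 1
        fast_step = counts[action] ** -0.6  # assumed schedule
        estimates[action] += fast_step * (observed - estimates[action])

        # Slow time scale: move the strategy toward the softmax of the estimates.
        slow_step = 1.0 / (t + 1)           # assumed schedule
        target = softmax(estimates, temperature)
        strategy = [p + slow_step * (q - p) for p, q in zip(strategy, target)]

    return strategy, estimates


if __name__ == "__main__":
    strategy, estimates = learn([0.2, 0.5, 0.9])
    print("strategy: ", [round(p, 3) for p in strategy])
    print("estimates:", [round(u, 3) for u in estimates])
```

Because the softmax target keeps every action's probability strictly positive, each action continues to be sampled, which is what lets the estimates for all actions (not just the best one) keep improving in this sketch.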
