Paper Information - A Novel Dynamic Spectrum Allocation Algorithm Based on POMDP Reinforcement Learning

A Novel Dynamic Spectrum Allocation Algorithm Based on POMDP Reinforcement Learning

A game model based on the Vickrey-Clarke-Groves (VCG) mechanism is presented for dynamic spectrum allocation, in order to reduce the complexity of the allocation and the amount of information exchanged during it. Further, a partially observable Markov decision process (POMDP) reinforcement learning algorithm is presented. By observing historical information and collecting statistics on it, the secondary users increase the reward of their bidding strategy through continuous learning, so as to obtain the optimal bidding strategy. Finally, the POMDP reinforcement learning algorithm is transformed into an optimal-strategy learning algorithm for a belief Markov decision process (MDP), which is solved using the value iteration algorithm. The simulation results reveal that the POMDP reinforcement learning algorithm can significantly improve the performance of dynamic spectrum allocation.
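The reduction from a POMDP to a belief MDP, followed by value iteration, can be illustrated with a minimal sketch. This is not the paper's model: the two hidden states, the transition, observation, and reward tables, the discount factor, and the function names `belief_update`, `interpolate`, and `value_iteration` are all hypothetical, chosen only to keep the example small. The belief is the single number p = P(s = 1), discretized on a grid, and the value function is linearly interpolated between grid points.

```python
# Hypothetical toy model (all numbers are illustrative, not from the paper).
# Hidden state s: 0 = weak competition, 1 = strong competition.
# Action a: 0 = bid low, 1 = bid high.
# Observation o: noisy reading of the competition level (0 = weak, 1 = strong).
T = [[0.9, 0.1], [0.2, 0.8]]    # T[s][s2] = P(s2 | s), assumed identical for both actions
O = [[0.8, 0.2], [0.3, 0.7]]    # O[s2][o] = P(o | s2), assumed identical for both actions
R = [[1.0, -1.0], [0.2, 0.5]]   # R[a][s] = immediate reward of action a in state s
GAMMA = 0.9                     # discount factor (assumed)


def belief_update(p, o):
    """Bayes filter on p = P(s = 1). Returns (updated p, probability of observing o)."""
    pred1 = (1.0 - p) * T[0][1] + p * T[1][1]   # predicted P(s2 = 1) before observing
    pred0 = 1.0 - pred1
    w0 = O[0][o] * pred0                         # unnormalized posterior for s2 = 0
    w1 = O[1][o] * pred1                         # unnormalized posterior for s2 = 1
    p_o = w0 + w1
    if p_o == 0.0:                               # observation impossible under this belief
        return p, 0.0
    return w1 / p_o, p_o


def interpolate(values, p):
    """Linearly interpolate grid values at belief p in [0, 1]."""
    x = p * (len(values) - 1)
    lo = min(int(x), len(values) - 2)
    frac = x - lo
    return (1.0 - frac) * values[lo] + frac * values[lo + 1]


def value_iteration(n_grid=101, tol=1e-6, max_iter=1000):
    """Value iteration over a discretized belief space; returns (grid, V, policy)."""
    grid = [i / (n_grid - 1) for i in range(n_grid)]
    V = [0.0] * n_grid
    policy = [0] * n_grid
    for _ in range(max_iter):
        V_new = []
        for i, p in enumerate(grid):
            # Expected discounted future value; action-independent in this toy model
            # because T and O do not depend on the action.
            future = 0.0
            for o in (0, 1):
                p_next, p_o = belief_update(p, o)
                future += p_o * interpolate(V, p_next)
            q = [(1.0 - p) * R[a][0] + p * R[a][1] + GAMMA * future for a in (0, 1)]
            policy[i] = 0 if q[0] >= q[1] else 1
            V_new.append(max(q))
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < tol:
            break
    return grid, V, policy


if __name__ == "__main__":
    grid, V, policy = value_iteration()
    switch = next(p for p, a in zip(grid, policy) if a == 1)
    print("bid high once P(strong competition) reaches", switch)
```

Because the belief is a sufficient statistic for the observation history, the learner only needs to track p rather than the full history. A grid with interpolation is one simple way to approximate the continuous belief space; other approximations would serve equally well here.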

Zeng Xiao-ping