Optimizing Wireless Systems Using Unsupervised and Reinforced-Unsupervised Deep Learning

Resource allocation and transceivers in wireless networks are usually designed by solving constrained optimization problems, which can be formulated as either variable or functional optimization. If the objective and constraint functions of a variable optimization problem can be derived, standard numerical algorithms can be applied to find the optimal solution, but they incur high computational cost when the dimension of the variables is high. To reduce the online computational complexity, learning the optimal solution as a function of the environment's status with deep neural networks (DNNs) is an effective approach. DNNs can be trained under the supervision of optimal solutions, but such supervision is not available in scenarios without models or in functional optimization, where the optimal solutions are hard to obtain. If the objective and constraint functions are unavailable, reinforcement learning can be applied to solve a functional optimization problem, but general-purpose reinforcement learning is not tailored to the optimization problems arising in wireless networks. In this article, we introduce unsupervised and reinforced-unsupervised learning frameworks for solving both variable and functional optimization problems without the supervision of optimal solutions. When the mathematical model of the environment is completely known, an unsupervised learning algorithm can be invoked, whether or not the distribution of the environment's status is known. When the mathematical model of the environment is incomplete, we introduce reinforced-unsupervised learning algorithms that learn the model by interacting with the environment. Simulation results for a user association problem confirm the applicability of these learning frameworks.
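To illustrate the unsupervised framework in the model-based case, the sketch below (our own illustration, not code from the article) trains a DNN in PyTorch with a primal-dual update on a toy constrained power-control problem: the network maps sampled channel states to transmit powers, the known sum-rate expression serves directly as the training objective, and a Lagrange multiplier enforces an average power constraint, so no labelled optimal solutions are needed. The problem setup, architecture, and all names (policy, sum_rate, P_MAX, and so on) are hypothetical; the article's own example is user association.

# Minimal sketch (assumptions noted above): unsupervised "learning to optimize"
# for a toy constrained problem. A DNN maps the environment state (channel
# gains h) to a decision (transmit powers p) and is trained by maximizing the
# expected objective with a Lagrangian term for the constraint -- no labelled
# optimal solutions are used.
import torch
import torch.nn as nn

K = 4          # number of interfering links (hypothetical toy setting)
P_MAX = 1.0    # average total-power budget
NOISE = 0.1    # receiver noise power

policy = nn.Sequential(                  # DNN approximating the optimal policy p = pi(h)
    nn.Linear(K * K, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, K), nn.Sigmoid(),      # powers normalized to [0, 1]
)

lam = torch.tensor(0.0)                  # dual variable for the power constraint
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
dual_lr = 0.01

def sum_rate(h, p):
    """Sum rate of K interfering links; the objective is known in closed form (model-based case)."""
    signal = torch.diagonal(h, dim1=-2, dim2=-1) * p          # desired-link gains times own power
    interference = (h * p.unsqueeze(-2)).sum(-1) - signal     # received interference at each link
    return torch.log2(1.0 + signal / (interference + NOISE)).sum(-1)

for step in range(5000):
    h = torch.rand(256, K, K)                  # sample environment states (channel gains)
    p = P_MAX * policy(h.reshape(256, -1))     # DNN output: power allocation for each state
    constraint = p.sum(-1).mean() - P_MAX      # E[sum of powers] <= P_MAX
    # Primal step: ascend the Lagrangian with respect to the DNN parameters.
    loss = -(sum_rate(h, p).mean() - lam * constraint)
    opt.zero_grad()
    loss.backward()
    opt.step()
    # Dual step: projected gradient ascent on the multiplier.
    lam = torch.clamp(lam + dual_lr * constraint.detach(), min=0.0)

The same structure carries over to the reinforced-unsupervised case in spirit: when a closed-form objective such as sum_rate is unavailable, its value or gradient would have to be estimated from interactions with the environment rather than computed analytically.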
