Constrained Deep Reinforcement Based Functional Split Optimization in Virtualized RANs

Virtualized Radio Access Network (vRAN) brings agility to Next-Generation RAN through functional split. It allows the base station (BS) functions to be decomposed into virtualized components, each hosted at either the distributed unit (DU) or the central unit (CU). However, deciding which functions to deploy at the DU or the CU so as to minimize the total network cost is challenging. In this paper, a constrained deep reinforcement based functional split optimization (CDRS) is proposed to optimize the locations of functions in vRAN. Our formulation results in a combinatorial, NP-hard problem for which finding the exact solution is computationally expensive. Hence, in our proposed approach, a policy gradient method with Lagrangian relaxation is applied, which uses a penalty signal to lead the policy toward constraint satisfaction. The policy is approximated by a neural network architecture formed by an encoder-decoder sequence-to-sequence model based on stacked Long Short-Term Memory (LSTM) networks. Greedy decoding and temperature sampling are also used as a search strategy that infers the best solution among candidates from multiple trained models, which helps to avoid severe suboptimality. Simulations are performed to evaluate the performance of the proposed solution on both synthetic and real network datasets. Our findings reveal that CDRS successfully learns the optimal decision, solves the problem with an optimality gap of 0.05%, and is the most cost-effective option compared to the available RAN setups. Moreover, altering the routing cost and traffic load does not significantly degrade the optimality performance. The results also show that all of our CDRS settings have a shorter computational time than the optimal baseline solver. Our proposed method fills the gap in functional split optimization by offering a near-optimal solution, a shorter computational time, and minimal hand-engineering.
Index Terms: Virtualized RANs, Functional Split, Optimization, Neural Network, Deep Reinforcement Learning

A preliminary version of this work appears in IEEE ICC 2021 Workshop [1]. This work was supported by the Academy of Finland 6Genesis Flagship (grant no. 318927).

arXiv:2106.00011v1 [cs.NI] 31 May 2021
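
The penalty signal and the search strategy summarized in the abstract can be illustrated with a minimal sketch. It is not the implementation used in this work: the toy costs, the single CU capacity constraint, the fixed multiplier `lam`, and all function names are hypothetical, and a table of per-BS logits stands in for the LSTM encoder-decoder policy. The policy gradient update itself is omitted; the sketch only shows how a Lagrangian-style penalized cost can rank candidates obtained by greedy decoding and temperature sampling.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Softmax with temperature; a higher temperature flattens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_decode(logits_per_bs):
    """Pick the highest-scoring split option for every BS."""
    return [max(range(len(l)), key=l.__getitem__) for l in logits_per_bs]

def sample_decode(logits_per_bs, temperature, rng):
    """Draw one split option per BS from the temperature-scaled policy."""
    return [rng.choices(range(len(l)), weights=softmax(l, temperature))[0]
            for l in logits_per_bs]

def penalized_cost(splits, cost, cu_load, cu_capacity, lam):
    """Total cost plus lam times the CU capacity violation (hypothetical constraint)."""
    total_cost = sum(cost[b][s] for b, s in enumerate(splits))
    violation = max(0.0, sum(cu_load[s] for s in splits) - cu_capacity)
    return total_cost + lam * violation, violation

def search(logits_per_bs, cost, cu_load, cu_capacity, lam,
           temperature=2.0, n_samples=64, seed=0):
    """Return the candidate with the lowest penalized cost among greedy and sampled ones."""
    rng = random.Random(seed)
    candidates = [greedy_decode(logits_per_bs)]
    candidates += [sample_decode(logits_per_bs, temperature, rng)
                   for _ in range(n_samples)]
    return min(candidates,
               key=lambda s: penalized_cost(s, cost, cu_load, cu_capacity, lam)[0])

# Toy instance (all numbers are made up): 3 BSs, option 0 keeps functions
# at the DU, option 1 moves them to the CU and consumes one unit of CU load.
logits = [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]
cost = [[5.0, 3.0], [4.0, 2.5], [6.0, 3.5]]
cu_load = [0.0, 1.0]
cu_capacity = 2.0
lam = 10.0

greedy = greedy_decode(logits)
best = search(logits, cost, cu_load, cu_capacity, lam)
```

In this toy instance the greedy candidate centralizes every BS and violates the capacity constraint, so its penalized cost is inflated by the multiplier. Because the greedy candidate is always included in the candidate set, the searched solution can never score worse than it.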

[1] Samy Bengio, et al. Neural Combinatorial Optimization with Reinforcement Learning, 2016, ICLR.

[2] A. Banchs, et al. vrAIn: Deep Learning Based Orchestration for Computing and Radio Resources in vRANs, 2020, IEEE Transactions on Mobile Computing.

[3] Aleksandra Checko, et al. A Survey of the Functional Splits Proposed for 5G Mobile Crosshaul Networks, 2019, IEEE Communications Surveys & Tutorials.

[4] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.

[5] George Iosifidis, et al. FluidRAN: Optimized vRAN/MEC Orchestration, 2018, IEEE INFOCOM 2018 - IEEE Conference on Computer Communications.

[6] Fidel Liberal, et al. Virtual Network Function Placement Optimization With Deep Reinforcement Learning, 2020, IEEE Journal on Selected Areas in Communications.

[7] Antti Ylä-Jääski, et al. Machine Learning Meets Communication Networks: Current Trends and Future Challenges, 2020, IEEE Access.

[8] Yoshua Bengio, et al. Machine Learning for Combinatorial Optimization: a Methodological Tour d'Horizon, 2018, Eur. J. Oper. Res.

[9] Bernard M. Waxman, et al. Routing of multipoint connections, 1988, IEEE J. Sel. Areas Commun.

[10] Marco Pavone, et al. Risk-Constrained Reinforcement Learning with Percentile Risk Criteria, 2015, J. Mach. Learn. Res.

[11] Gerhard Fettweis, et al. Are Heterogeneous Cloud-Based Radio Access Networks Cost Effective?, 2015, IEEE Journal on Selected Areas in Communications.

[12] Quoc V. Le, et al. Sequence to Sequence Learning with Neural Networks, 2014, NIPS.

[13] Himank Gupta, et al. Apt-RAN: A Flexible Split-Based 5G RAN to Minimize Energy Consumption and Handovers, 2020, IEEE Transactions on Network and Service Management.

[14] Matti Latva-aho, et al. Deep Reinforcement Based Optimization of Function Splitting in Virtualized Radio Access Networks, 2021, 2021 IEEE International Conference on Communications Workshops (ICC Workshops).

[15] Shie Mannor, et al. Reward Constrained Policy Optimization, 2018, ICLR.

[16] Lawrence V. Snyder, et al. Reinforcement Learning for Solving the Vehicle Routing Problem, 2018, NeurIPS.

[17] Michal Pióro, et al. SNDlib 1.0—Survivable Network Design Library, 2010, Networks.

[18] George Iosifidis, et al. An Optimal Deployment Framework for Multi-Cloud Virtualized Radio Access Networks, 2021, IEEE Transactions on Wireless Communications.

[19] Yuan Zhang, et al. Neural Combinatorial Optimization for Energy-Efficient Offloading in Mobile Edge Computing, 2020, IEEE Access.

[20] Rami Langar, et al. Deep Learning based User Slice Allocation in 5G Radio Access Networks, 2020, 2020 IEEE 45th Conference on Local Computer Networks (LCN).

[21] Kleber Vieira Cardoso, et al. PlaceRAN: Optimal Placement of Virtualized Network Functions in the Next-generation Radio Access Networks, 2021, arXiv:2102.13192.

[22] George Iosifidis, et al. Joint Optimization of Edge Computing Architectures and Radio Access Networks, 2018, IEEE Journal on Selected Areas in Communications.

[23] Mohammad Sohel Rahman, et al. Solving the Multidimensional Multiple-choice Knapsack Problem by constructing convex hulls, 2006, Comput. Oper. Res.

[24] Andres Garcia-Saavedra, et al. WizHaul: On the Centralization Degree of Cloud RAN Next Generation Fronthaul, 2018, IEEE Transactions on Mobile Computing.

[25] Andres Garcia-Saavedra, et al. 5G-Crosshaul: An SDN/NFV Integrated Fronthaul/Backhaul Transport Network Architecture, 2017, IEEE Wireless Communications.

[26] R. M. A. P. Rajatheva, et al. 6G White Paper on Machine Learning in Wireless Communication Networks, 2020, ArXiv.

[27] Navdeep Jaitly, et al. Pointer Networks, 2015, NIPS.

[28] Josu Ceberio, et al. Constrained Combinatorial Optimization with Reinforcement Learning, 2020, ArXiv.

[29] George Iosifidis, et al. On the Optimization of Multi-Cloud Virtualized Radio Access Networks, 2020, ICC 2020 - 2020 IEEE International Conference on Communications (ICC).

[30] K. Schittkowski, et al. Nonlinear Programming, 2022.

[31] Samy Bengio, et al. Device Placement Optimization with Reinforcement Learning, 2017, ICML.

[32] Eduardo F. Morales, et al. An Introduction to Reinforcement Learning, 2011.