Coded Computation Against Processing Delays for Virtualized Cloud-Based Channel Decoding

The uplink of a cloud radio access network architecture is studied in which decoding at the cloud takes place via network function virtualization on commercial off-the-shelf servers. To mitigate the impact of straggling decoders on this platform, a novel coding strategy is proposed, whereby the cloud re-encodes the received frames via a linear code before distributing them to the decoding processors. Transmission of a single frame is considered first, and upper bounds on the resulting frame unavailability probability as a function of the decoding latency are derived under the assumption of a binary symmetric channel for uplink communications. The analysis is then extended to account for random frame arrival times. In this case, the tradeoff between the average decoding latency and the frame error rate is studied for two queuing policies, in which the servers carry out per-frame decoding and continuous decoding, respectively. Numerical examples demonstrate that the bounds are useful tools for code design and that coding is instrumental in obtaining a desirable compromise between decoding latency and reliability.
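The re-encoding idea can be illustrated with a toy sketch. Everything specific below is an assumption made for illustration and is not taken from the paper: the uplink channel code is a 3-fold repetition code (chosen only because it is linear), the cloud-side linear code is a (3, 2) single parity check across two received frames, and the names `channel_encode`, `channel_decode`, `cloud_reencode`, and `recover` are hypothetical. Because the channel code is linear, the XOR of two noisy codewords is itself a noisy codeword, so a third server can decode the XOR frame and any two of the three servers suffice to recover both messages.

```python
import numpy as np

REP = 3  # assumed toy channel code: repeat every bit REP times (a linear code)

def channel_encode(msg):
    """Encode a binary message with the toy repetition code."""
    return np.repeat(msg, REP)

def channel_decode(rx):
    """Majority-vote decoder for the repetition code (hard decisions)."""
    return (rx.reshape(-1, REP).sum(axis=1) > REP // 2).astype(np.uint8)

# Assumed cloud-side code: (N=3, K=2) single parity check over GF(2).
# Server 0 gets frame 0, server 1 gets frame 1, server 2 gets their XOR.
G = np.array([[1, 0], [0, 1], [1, 1]], dtype=np.uint8)

def cloud_reencode(received):
    """Linearly combine the K received frames into N coded frames over GF(2)."""
    return (G @ received) % 2

def recover(decoded):
    """Recover both messages from the outputs of any two of the three servers.

    `decoded` maps a server index to its decoded message. The three outputs
    XOR to zero, so one missing output is the XOR of the other two.
    """
    d = dict(decoded)
    missing = {0, 1, 2} - set(d)
    if len(missing) > 1:
        raise ValueError("need the outputs of at least two servers")
    for s in missing:
        a, b = (d[t] for t in sorted(d))
        d[s] = a ^ b
    return d[0], d[1]

# Two messages sent over a binary symmetric channel (bit flips chosen by hand).
msgs = np.array([[1, 0, 1, 1], [0, 1, 1, 0]], dtype=np.uint8)
received = np.stack([channel_encode(m) for m in msgs])
received[0, 0] ^= 1  # one channel error in frame 0
received[1, 5] ^= 1  # one channel error in frame 1

coded = cloud_reencode(received)

# Suppose server 1 straggles: only servers 0 and 2 finish decoding in time.
outputs = {s: channel_decode(coded[s]) for s in (0, 2)}
u0, u1 = recover(outputs)
print(u0, u1)
```

Note that the XOR frame handed to server 2 carries the errors of both received frames, so it sees a noisier effective channel than the other two servers. In this sketch, protection against stragglers therefore comes at the cost of decoding reliability.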
