Deep Learning-Based Data Storage for Low Latency in Data Center Networks

Low-latency data access is becoming an increasingly important challenge. Proper placement of data blocks reduces data movement across distributed storage systems, which contributes significantly to latency reduction. However, prevailing data placement optimizations rely primarily on data requests known in advance or on a static initial data distribution, ignoring the dynamics of clients’ data access requests and of the network. Learning techniques can help data center networks (DCNs) learn from historical access information and make optimal data storage decisions. Considering a more practical DCN with a fat-tree topology, we use the learning technique <inline-formula> <tex-math notation="LaTeX">$k$ </tex-math></inline-formula>-means to place data blocks and thereby reduce the read and write latency of the DCN, where <inline-formula> <tex-math notation="LaTeX">$k$ </tex-math></inline-formula> is the number of cores in the fat-tree. Evaluation results show that the average write and read latency of the whole system can be lowered by 33% and 45%, respectively. We also analyze the choice of the parameter <inline-formula> <tex-math notation="LaTeX">$k$ </tex-math></inline-formula> and recommend, as guidance for real applications, setting it equal to the number of cores in the DCN.
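
As an illustration only, the idea of clustering data blocks by historical access pattern and mapping each cluster to one core of the fat-tree can be sketched as follows. The abstract does not specify the features, distance measure, or initialization, so everything here is an assumption: the per-block access-count vectors, the function names `kmeans` and `place_blocks`, the `num_cores` parameter, and the rule that cluster index i is served by core i are all hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    # Assumption: centroids start at k randomly sampled input points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean_vector(members)
    return labels


def place_blocks(access_vectors, num_cores):
    """Map each block id to a cluster index in [0, num_cores).

    access_vectors: dict of block id -> list of historical access counts,
    one entry per client group (a hypothetical feature choice).
    """
    block_ids = sorted(access_vectors)
    points = [access_vectors[b] for b in block_ids]
    labels = kmeans(points, k=num_cores)
    return dict(zip(block_ids, labels))


if __name__ == "__main__":
    # Hypothetical history: counts of requests from two client groups.
    history = {
        "a1": [10, 0], "a2": [9, 1], "a3": [11, 0],
        "b1": [0, 10], "b2": [1, 9], "b3": [0, 11],
    }
    print(place_blocks(history, num_cores=2))
```

Blocks requested by the same clients end up in the same cluster, so storing a cluster under a single core keeps most of its traffic local. Setting `num_cores` to the number of cores follows the recommendation in the abstract.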
