Guarding Fast Data Delivery in Cloud: An Effective Approach to Isolating Performance Bottleneck During Slow Data Delivery

Cloud-based products rely heavily on fast data delivery between data centers and remote users; when delivery is slow, product performance is crippled. When slow data delivery occurs, engineers must investigate and find the root cause. To facilitate these investigations, we propose an algorithm that automatically identifies the performance bottleneck. The algorithm aggregates information from multiple layers of the data sender and receiver, and isolates the problem type by identifying which component (sender, receiver, or network) is the bottleneck. After isolation, subsequent efforts can focus on finding the exact root cause. We also build a prototype to demonstrate the effectiveness of the algorithm.
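
To make the isolation step concrete, the following is a minimal sketch and not the algorithm proposed here, whose details are not given in this summary. It assumes that the aggregated measurements can be reduced to the fraction of transfer time during which each component was the limiting factor. The names `TransferSample`, `isolate_bottleneck`, and `min_share`, as well as the three input fields, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Bottleneck(Enum):
    SENDER = "sender"
    RECEIVER = "receiver"
    NETWORK = "network"
    UNKNOWN = "unknown"


@dataclass
class TransferSample:
    """Hypothetical aggregated view of one slow transfer.

    Each field is an assumed fraction (0.0 to 1.0) of the transfer time
    during which that component limited the sending rate.
    """
    sender_limited: float    # sender had no data ready to send
    receiver_limited: float  # sender was blocked by the receiver
    network_limited: float   # sender was blocked by the network path


def isolate_bottleneck(sample: TransferSample, min_share: float = 0.5) -> Bottleneck:
    """Return the component with the largest share of limited time.

    If no component reaches min_share (an assumed threshold), the result
    is UNKNOWN so that engineers do not chase a weak signal.
    """
    shares = {
        Bottleneck.SENDER: sample.sender_limited,
        Bottleneck.RECEIVER: sample.receiver_limited,
        Bottleneck.NETWORK: sample.network_limited,
    }
    component, share = max(shares.items(), key=lambda item: item[1])
    return component if share >= min_share else Bottleneck.UNKNOWN


if __name__ == "__main__":
    sample = TransferSample(sender_limited=0.1, receiver_limited=0.7, network_limited=0.2)
    print(isolate_bottleneck(sample).value)
```

Returning an explicit `UNKNOWN` value when no component dominates is a design choice of this sketch: it keeps the output limited to the sender/receiver/network classification described above while avoiding a forced answer on ambiguous data.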