Scalable resource control in large-scale computing/networking infrastructures
The rapid advances in Information Services (IS) have generated many pressing administration challenges in the Information Technology (IT) infrastructure space. Typically, an IS relies on the Internet or some other communication network to be widely accessible, and on some form of computing data center for support. The tremendous success and fast-paced development of IS have thus resulted in massive infrastructure growth. This growth, coupled with the high expectations for the promising next generation of IS, has tightened performance requirements and created many critical problems in IT infrastructures. This dissertation considers some of the resource control problems that arise in the various layers of large-scale computing systems and networks.
Adopting a risk mitigation approach, two security issues are addressed first. First, the problem of controlling the recovery time from epidemic spreads (i.e., viruses and worms) by applying resources to strengthen vulnerable sites and links is analyzed for network topologies. This problem is formulated as an eigenvalue control problem, which allows the construction of scalable distributed algorithms via convex optimization. Second, patch management for software vulnerabilities is studied in data center environments. Unlike the existing literature, which attempts to optimize defense against an active attack, the objective here is to find patching policies that minimize the service disruption cost from both maintenance operations and exploitation threats, using a Dynamic Programming (DP) framework.
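The link between epidemic recovery and eigenvalues can be illustrated with a small sketch. The abstract does not state the dissertation's model, so the following assumes a standard linearized discrete-time SIS-type model in which infection probabilities evolve as p(t+1) = M p(t), with M = diag(1 - delta_i) + beta * A. Here A is the adjacency matrix, and `beta` (infection rate), `delta` (per-node curing rates), and the function names are all hypothetical choices made for illustration. Under this assumption the infection decays roughly like rho(M)^t, so spending resources to raise curing rates amounts to reducing the largest eigenvalue of M.

```python
# Minimal sketch (assumed model, not the dissertation's formulation):
# linearized epidemic dynamics p(t+1) = M p(t), M = diag(1 - delta) + beta * A.

def system_matrix(adj, beta, delta):
    """Build M = diag(1 - delta_i) + beta * A as a list of lists."""
    n = len(adj)
    return [[(1.0 - delta[i] if i == j else 0.0) + beta * adj[i][j]
             for j in range(n)] for i in range(n)]

def spectral_radius(m, iters=2000):
    """Power iteration; valid here because M is nonnegative with a
    positive diagonal, so its dominant eigenvalue is real and simple."""
    n = len(m)
    v = [1.0] * n
    lam = 0.0
    for _ in range(iters):
        w = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = max(abs(x) for x in w)
        if lam == 0.0:
            return 0.0
        v = [x / lam for x in w]
    return lam

# Star topology: node 0 is the hub, nodes 1..4 are leaves.
n = 5
adj = [[0] * n for _ in range(n)]
for leaf in range(1, n):
    adj[0][leaf] = adj[leaf][0] = 1

beta = 0.2
uniform = [0.3] * n                    # same curing rate everywhere
hub_boost = [0.9] + [0.3] * (n - 1)    # extra resources on the hub
leaf_boost = [0.3, 0.9] + [0.3] * (n - 2)  # same resources on one leaf

rho_before = spectral_radius(system_matrix(adj, beta, uniform))
rho_hub = spectral_radius(system_matrix(adj, beta, hub_boost))
rho_leaf = spectral_radius(system_matrix(adj, beta, leaf_boost))

print(rho_before, rho_hub, rho_leaf)
```

In this toy instance the spectral radius is about 1.1 with uniform curing rates (the linearized infection grows), drops to about 0.9 when the extra resources go to the hub, and stays above 1 when the same resources go to a single leaf. Where the resources are applied therefore matters, which is the kind of allocation question an eigenvalue control formulation is meant to answer.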
The focus is then placed on power/cost-aware management of individual switching components. Well-established scheduling algorithms are redesigned to balance the trade-off between power and packet delay in input-queued (IQ) switches while maintaining their ability to achieve maximal throughput. Additionally, after extending long-standing queueing results, it is proven that power/cost-efficient controls need not lack this ability in general queueing/switching service structures (QSSSs), by showing that "low workload" decisions are insignificant for stability.
Finally, the problem of online outsourcing of execution capacity in computing and data storage services is tackled. Leveraging the fact that such services have only recently become available online, and using a DP formulation, the cost and risk of the supported operations are dynamically optimized. A case study in data storage services validates the findings.