Moving into the Cloud

Cloud computing is the notion of abstracting and outsourcing hardware or software resources over the Internet, often to a third party on a pay-as-you-go basis. This emerging concept is sometimes claimed to represent a completely new paradigm, with disruptive effects on our means of viewing and accessing computational resources. In this thesis, we investigate the state-of-the-art of cloud computing with the aim of providing a clear understanding of the opportunities present in cloud computing, along with knowledge about the limitations and challenges in designing systems for cloud environments. We argue that this knowledge is essential for potential adopters, yet is not readily available due to the confusion currently surrounding cloud computing. Our findings reveal that challenges associated with hosting systems in cloud environments include increased latency due to longer network distances, limited bandwidth since packets must cross the Internet to reach the cloud, as well as reduced portability because cloud environments are currently lacking standardization. Additionally, systems must be designed in a loosely coupled and fault-tolerant way to fully exploit the dynamic features of cloud computing, meaning that existing applications might require significant modification before being able to fully utilize a cloud environment. These challenges also restrict some systems from being suitable for running in the cloud. Furthermore, we have implemented a prototype in Amazon EC2 to investigate the feasibility of moving an enterprise search service to a cloud environment. We base the feasibility of our approach onmeasurements of response time, bandwidth and scalability. We conclude that our cloud-based search service is feasible due to its opportunities for implementing dynamic scaling and reducing local infrastructure in a novel fashion.

[1]  Bogdan M. Wilamowski,et al.  The Transmission Control Protocol , 2005, The Industrial Information Technology Handbook.

[2]  Dag Johansen,et al.  Oivos: Simple and Efficient Distributed Data Processing , 2008, 2008 10th IEEE International Conference on High Performance Computing and Communications.

[3]  Huan Liu,et al.  GridBatch: Cloud Computing for Large-Scale Data-Intensive Batch Applications , 2008, 2008 Eighth IEEE International Symposium on Cluster Computing and the Grid (CCGRID).

[4]  Ian Foster,et al.  The Grid 2 - Blueprint for a New Computing Infrastructure, Second Edition , 1998, The Grid 2, 2nd Edition.

[5]  Sanjay Ghemawat,et al.  MapReduce: Simplified Data Processing on Large Clusters , 2004, OSDI.

[6]  Roy T. Fielding,et al.  Principled design of the modern Web architecture , 2000, Proceedings of the 2000 International Conference on Software Engineering. ICSE 2000 the New Millennium.

[7]  Werner Vogels,et al.  Dynamo: amazon's highly available key-value store , 2007, SOSP.

[8]  Ian T. Foster,et al.  MPICH-G2: A Grid-enabled implementation of the Message Passing Interface , 2002, J. Parallel Distributed Comput..

[9]  Knut Magne Risvik,et al.  Multi-tier architecture for Web search engines , 2003, Proceedings of the IEEE/LEOS 3rd International Conference on Numerical Simulation of Semiconductor Optoelectronic Devices (IEEE Cat. No.03EX726).

[10]  Yong Zhao,et al.  Cloud Computing and Grid Computing 360-Degree Compared , 2008, GCE 2008.

[11]  Douglas Thain,et al.  Distributed computing in practice: the Condor experience , 2005, Concurr. Pract. Exp..

[12]  Eric Traut Building the virtual PC , 1997 .

[13]  Bing Wu,et al.  Legacy Information Systems: Issues and Directions , 1999, IEEE Softw..

[14]  Gerald J. Popek,et al.  Formal requirements for virtualizable third generation architectures , 1974, SOSP '73.

[15]  Russ Housley,et al.  Internet X.509 Public Key Infrastructure Certificate and Certificate Revocation List (CRL) Profile , 2002, RFC.

[16]  Ole Agesen,et al.  A comparison of software and hardware techniques for x86 virtualization , 2006, ASPLOS XII.

[17]  Brian Hayes,et al.  What Is Cloud Computing? , 2019, Cloud Technologies.

[18]  Steven J. Vaughan-Nichols,et al.  New Approach to Virtualization Is a Lightweight , 2006, Computer.

[19]  Doug Johnson,et al.  Computing in the Clouds. , 2010 .

[20]  No License,et al.  Intel ® 64 and IA-32 Architectures Software Developer ’ s Manual Volume 3 A : System Programming Guide , Part 1 , 2006 .

[21]  Wilson C. Hsieh,et al.  Bigtable: A Distributed Storage System for Structured Data , 2006, TOCS.

[22]  Charles J. Testa,et al.  Human factors design criteria in man-computer interaction , 1974, ACM '74.

[23]  L. Youseff,et al.  Toward a Unified Ontology of Cloud Computing , 2008, 2008 Grid Computing Environments Workshop.

[24]  Robert J. Creasy,et al.  The Origin of the VM/370 Time-Sharing System , 1981, IBM J. Res. Dev..

[25]  Message Passing Interface Forum MPI: A message - passing interface standard , 1994 .

[26]  Mladen A. Vouk,et al.  Cloud computing — Issues, research and implementations , 2008, ITI 2008 - 30th International Conference on Information Technology Interfaces.

[27]  Chris Rose,et al.  A Break in the Clouds: Towards a Cloud Definition , 2011 .

[28]  Marvin Theimer,et al.  Bayou: replicated database services for world-wide applications , 1996, EW 7.

[29]  Peter Druschel,et al.  Resource containers: a new facility for resource management in server systems , 1999, OSDI '99.

[30]  Ami Marowka,et al.  The GRID: Blueprint for a New Computing Infrastructure , 2000, Parallel Distributed Comput. Pract..

[31]  Shang Zhi,et al.  A proof of the queueing formula: L=λW , 2001 .

[32]  Donald D. Chamberlin,et al.  SEQUEL: A structured English query language , 1974, SIGFIDET '74.

[33]  GhemawatSanjay,et al.  The Google file system , 2003 .

[34]  Roy T. Fielding,et al.  Hypertext Transfer Protocol - HTTP/1.1 , 1997, RFC.

[35]  David Cooper,et al.  Internet X.509 Public Key Infrastructure Certificate and Certificate Revocation List (CRL) Profile , 2008, RFC.

[36]  William E. Johnston,et al.  Network Communication as a Service-Oriented Capability , 2006, High Performance Computing Workshop.

[37]  Richard Wolski,et al.  The Eucalyptus Open-Source Cloud-Computing System , 2009, 2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid.

[38]  Marianne Shaw,et al.  Denali: Lightweight Virtual Machines for Distributed and Networked Applications , 2001 .

[39]  James E. Smith,et al.  The architecture of virtual machines , 2005, Computer.

[40]  Werner Vogels,et al.  Building reliable distributed systems at a worldwide scale demands trade-offs between consistency and availability. , 2022 .

[41]  A. D. Meglio,et al.  Programming the Grid with gLite , 2006 .

[42]  Beng-Hong Lim,et al.  Virtualizing I/O Devices on VMware Workstation's Hosted Virtual Machine Monitor , 2001, USENIX Annual Technical Conference, General Track.

[43]  Peter H. Gum,et al.  System/370 Extended Architecture: Facilities for Virtual Machines , 1983, IBM J. Res. Dev..