Introducing Cloud Computing Topics in Curricula

Demand for graduates with exposure to Cloud Computing is rising. For many educational institutions, the challenge is deciding how to incorporate appropriate cloud-based technologies into their curricula. In this paper, we describe our design for, and experience of, integrating Cloud Computing components into seven third- and fourth-year undergraduate information systems, computer science, and general science courses related to large-scale data processing and analysis at the University of Queensland, Australia. For each course, we aimed to find the best available cost-effective cloud technologies that fit well into the existing curriculum. The cloud-related technologies discussed in this paper include open-source distributed computing tools such as Hadoop, Mahout, and Hive, as well as cloud services such as Windows Azure and Amazon Elastic Compute Cloud (EC2). We anticipate that our experiences will be useful to fellow academics who want to introduce Cloud Computing modules into existing courses.
