Creating a Balanced Data Science Program

As we consider the next fifty years of computing education, one phenomenon that shows no signs of abating is the data deluge, in which companies, the natural sciences, the social sciences, professional sports teams, government agencies, and other institutions are generating ever-increasing quantities of data. The discipline of data science has arisen to address the challenges posed by this deluge, and a growing number of universities now offer undergraduate data science programs. Many of these programs originated in either a computer science department or a statistics department, which leads to a curriculum weighted more heavily toward either computing or statistics. By contrast, the data science program described in this paper is a joint endeavor between computer science and statistics that seeks to provide balanced training in both areas. Its broad goals are to produce students who (a) are well-trained in both computer science and statistics, (b) are equipped with specialized data-related skills that are not normally taught in either of those disciplines, and (c) can apply their skills to a domain area. This paper reports on the author's experiences leading the effort to create this program, which has grown in enrollment, received positive feedback from students, and is successfully preparing students for internships. We offer this report in the hope that it may serve as a model for other institutions considering the addition of an undergraduate data science program.
