An integrated, modular approach to data science education in microbiology

We live in an increasingly data-driven world, where high-throughput sequencing and mass spectrometry platforms are transforming biology into an information science. This transformation has shifted major challenges in biological research from data generation and processing to interpretation and knowledge translation. However, postsecondary training in bioinformatics, or more generally data science for life scientists, lags behind current demand. In particular, development of accessible, undergraduate data science curricula has the potential to improve research and learning outcomes as well as better prepare students in the life sciences to thrive in public and private sector careers. Here, we describe the Experiential Data science for Undergraduate Cross-Disciplinary Education (EDUCE) initiative, which aims to progressively build data science competency across several years of integrated practice. Through EDUCE, students complete data science modules integrated into required and elective courses augmented with coordinated cocurricular activities. The EDUCE initiative draws on a community of practice consisting of teaching assistants (TAs), postdocs, instructors, and research faculty from multiple disciplines to overcome several reported barriers to data science for life scientists, including instructor capacity, student prior knowledge, and relevance to discipline-specific problems. Preliminary survey results indicate that even a single module improves student self-reported interest and/or experience in bioinformatics and computer science. Thus, EDUCE provides a flexible and extensible active learning framework for integrating data science curricula into undergraduate courses and programs across the life sciences.
