Identifying Topical Coverages of Curricula using Topic Modeling and Visualization Techniques: A Case of Digital and Data Curation

Digital/data curation curricula have been around for a couple of decades. Currently, several ALA-accredited LIS programs offer digital/data curation courses and certificate programs to address the high demand for professionals with the knowledge and skills to handle digital content and research data in an ever-changing information environment. In this study, we examined the topical scope of digital/data curation curricula in the context of the LIS field, using a semi-automatic approach. We collected 16 syllabi from digital/data curation courses, as well as textual descriptions of 11 programs and their core courses offered in the U.S., Canada, and the U.K. The collected data were analyzed using a probabilistic topic modeling technique, Latent Dirichlet Allocation, to identify both common and unique topics. The analysis identified 20 topics at both the program and course levels. Comparison between the program- and course-level topics uncovered a set of unique topics and a number of common topics. Furthermore, we provide interactive visualizations of digital/data curation programs and courses for further analysis of topical distributions. We believe that our combined approach of topic modeling and visualization may provide insight for identifying emerging trends and co-occurrences of topics among digital/data curation curricula in the LIS field.

Received 09 August 2018 ~ Revision received 21 November ~ Accepted 27 January 2019

Correspondence should be addressed to Seungwon Yang, School of Library & Information Science and Center for Computation and Technology, Louisiana State University, 267 Coates Hall, Baton Rouge, LA 70803. E-mail: seungwonyang@lsu.edu

The International Journal of Digital Curation is an international journal committed to scholarly excellence and dedicated to the advancement of digital curation across a wide range of sectors. The IJDC is published by the University of Edinburgh on behalf of the Digital Curation Centre. ISSN: 1746-8256. URL: http://www.ijdc.net/

Copyright rests with the authors. This work is released under a Creative Commons Attribution Licence, version 4.0. For details please see https://creativecommons.org/licenses/by/4.0/

Yang, Ju and Chung. International Journal of Digital Curation 2019, Vol. 14, Iss. 1, 62–87. DOI: 10.2218/ijdc.v14i1.586 (http://dx.doi.org/10.2218/ijdc.v14i1.586)
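
The topic modeling step named in the abstract, Latent Dirichlet Allocation, can be illustrated with a minimal sketch. The abstract does not say which toolkit or inference method the authors used, so the code below is an assumption throughout: it fits LDA with collapsed Gibbs sampling using only the Python standard library. The function name `fit_lda`, the hyperparameters `alpha` and `beta`, and the toy "syllabus" token lists are all hypothetical and are not taken from the study's data.

```python
import random
from collections import Counter


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling (illustrative sketch only).

    docs is a list of token lists. Returns per-document topic counts and
    per-topic word counts. alpha and beta are assumed symmetric priors.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]         # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(num_topics)]  # count of word w in topic k
    topic_total = [0] * num_topics                       # tokens assigned to topic k
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the collapsed conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return doc_topic, topic_word


# Hypothetical, hand-written token lists standing in for syllabus text.
toy_syllabi = [
    "metadata standards preservation repository metadata".split(),
    "preservation repository digital preservation formats".split(),
    "research data management plan data sharing".split(),
    "data sharing research data repository policy".split(),
]

doc_topic, topic_word = fit_lda(toy_syllabi, num_topics=2)
for k, counts in enumerate(topic_word):
    top_words = [w for w, c in counts.most_common(3) if c > 0]
    print(f"topic {k}: {top_words}")
```

The per-document rows of `doc_topic` are the kind of topical distribution that a visualization could then display for each program or course; a real analysis would add stop-word removal and a much larger corpus.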
