Fostering Analytics on Learning Analytics Research: the LAK Dataset

This paper describes the Learning Analytics and Knowledge (LAK) Dataset, an unprecedented collection of structured data created from a set of key research publications in the emerging field of learning analytics. The unstructured publications have been processed and exposed in a variety of formats, most notably according to Linked Data principles, to simplify access for researchers and practitioners. The dataset is intended to enable investigations into, for instance, the evolution of the research field over time and its correlations with other disciplines, and to support compelling applications that use the dataset in innovative ways. In this paper, we describe the dataset, the design choices and their rationale, and provide an outlook on future investigations.
