DAJEE: A Dataset of Joint Educational Entities for Information Retrieval in Technology Enhanced Learning

In the Technology Enhanced Learning (TEL) community, the problem of conducting reproducible evaluations of recommender systems is still open, due to the lack of exhaustive benchmarks. The few public datasets available in TEL have limitations, being mostly small and local. Recently, Massive Open Online Courses (MOOC) are attracting many studies in TEL, mainly because of the huge amount of data for these courses and their potential for many applications in TEL. This paper presents DAJEE, a dataset built from the crawling of MOOCs hosted on the Coursera platform. DAJEE offers information on the usage of more than 20,000 resources in 407 courses by 484 instructors, with a conjunction of different educational entities in order to store the courses' structure and the instructors' teaching experiences.