Towards the use of Data Engineering, Advanced Visualization techniques and Association Rules to support knowledge discovery for public policies

Abstract. Education and employment are key aspects of a country's well-being, and governments spend valuable resources on designing education plans and employment programs. The two areas are usually analysed separately although they are closely related, and considering them together might improve the efficacy of both. The problem lies, at least in part, in the fact that different public entities manage their own data in their own isolated systems and do not develop joint educational and employment policies. To facilitate work towards this goal, in this manuscript we use Data Engineering, Data Visualization, and Intelligent Data Analytics methods to create a decision support system for the Government of Extremadura. Extremadura is a European Union Objective 1 region in Spain with high rates of unemployment and secondary school drop-out. Data Engineering is used to create a Data Warehouse that unifies the different data sources into a central repository for quick access and control, which addresses the challenge of transforming, processing, storing, and accessing the data. Data Visualization techniques are applied to create an interactive dashboard that assists users in analysing and interpreting the data in the Data Warehouse; its charts, diagrams, and maps are designed specifically to help technical and political decision-makers. Finally, Intelligent Data Analytics techniques are used to incorporate Association Rules into the visualization dashboard. The goal of these rules is to identify associations, relationships, and patterns in the data that are not apparent at first sight and that users would be unlikely to detect by inspecting the data themselves. As a result, a complete system was defined and implemented to support public administrations in their decision-making and in the definition of precise, evidence-based policies in the areas of education and employment. In particular, it allows the definition of unified strategies to reduce the unemployment rate.
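
To make the idea of Association Rules more concrete, the following is a minimal sketch of how single-item rules can be scored by support, confidence, and lift. It is not the system described in the manuscript: the records are hand-made for illustration only (they are not data from Extremadura), and the names `records`, `support`, and `mine_rules`, as well as the two thresholds, are assumptions introduced here.

```python
from itertools import permutations

# Hypothetical, hand-made records for illustration only;
# they do not come from the Data Warehouse described in the abstract.
records = [
    {"dropout", "unemployed"},
    {"dropout", "unemployed"},
    {"dropout", "employed"},
    {"graduate", "employed"},
    {"graduate", "employed"},
    {"graduate", "unemployed"},
]


def support(itemset, rows):
    """Fraction of rows that contain every item in `itemset`."""
    return sum(itemset <= row for row in rows) / len(rows)


def mine_rules(rows, min_support=0.3, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence, lift) tuples
    for single-item rules that pass both thresholds (assumed values)."""
    items = sorted(set().union(*rows))
    rules = []
    for antecedent, consequent in permutations(items, 2):
        pair_support = support({antecedent, consequent}, rows)
        if pair_support < min_support:
            continue
        # confidence = P(consequent | antecedent)
        confidence = pair_support / support({antecedent}, rows)
        if confidence < min_confidence:
            continue
        # lift > 1 means the items co-occur more often than if independent
        lift = confidence / support({consequent}, rows)
        rules.append((antecedent, consequent, pair_support, confidence, lift))
    return rules


for a, c, s, conf, lift in mine_rules(records):
    print(f"{a} -> {c}: support={s:.2f}, confidence={conf:.2f}, lift={lift:.2f}")
```

A production system would more likely rely on a frequent item set algorithm such as Apriori to handle multi-item rules over large tables; the exhaustive pairwise loop above is only meant to show what the three measures express.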
