Skills2Graph: Processing million Job Ads to face the Job Skill Mismatch Problem

In this paper, we present skills2graph, a tool that, starting from a user's set of professional skills, identifies the most suitable jobs as they emerge from a large corpus of more than 2.5 million Online Job Vacancies (OJVs) posted in three countries (the United Kingdom, France, and Germany). To this end, we rely on two sources of evidence: co-occurrence statistics, from which we compute a count-based measure of skill relevance named Revealed Comparative Advantage (rca), and distributional semantics, for which we train several embeddings on the OJV corpus and perform an intrinsic evaluation of their quality. Results, evaluated through a user study with 10 labor market experts, show a high P@3 for the recommendations provided by skills2graph and a high nDCG (0.985 and 0.984 on a [0,1] scale), which indicates a strong correlation between the experts' scores and the rankings generated by skills2graph.
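To make the two quantitative notions in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard Balassa-style definition of Revealed Comparative Advantage (the share of a skill within an occupation divided by the share of that skill in the whole corpus) and the usual linear-gain nDCG. The exact formulas used by skills2graph are not given in this section. The function names `rca` and `ndcg` and the toy counts are hypothetical.

```python
import math
from collections import Counter


def rca(counts):
    """Balassa-style RCA (assumed definition, not confirmed by the source).

    counts maps (occupation, skill) -> number of vacancies of that
    occupation that mention the skill.
    """
    occ_total = Counter()    # total skill mentions per occupation
    skill_total = Counter()  # total mentions per skill across occupations
    grand_total = 0
    for (occ, skill), n in counts.items():
        occ_total[occ] += n
        skill_total[skill] += n
        grand_total += n
    # Share of the skill within the occupation, relative to its corpus-wide share.
    return {
        (occ, skill): (n / occ_total[occ]) / (skill_total[skill] / grand_total)
        for (occ, skill), n in counts.items()
        if n > 0
    }


def ndcg(relevances):
    """nDCG with linear gain (assumed variant).

    relevances are expert scores listed in the order the system ranked the items.
    """
    def dcg(scores):
        return sum(s / math.log2(i + 2) for i, s in enumerate(scores))

    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Hypothetical toy counts, for illustration only.
toy_counts = {
    ("data scientist", "python"): 8,
    ("data scientist", "excel"): 2,
    ("accountant", "python"): 1,
    ("accountant", "excel"): 9,
}
scores = rca(toy_counts)
```

Under this definition, a value above 1 means the skill is over-represented in that occupation relative to the corpus as a whole, while an nDCG of 1 means the system's ranking matches the ordering implied by the expert scores.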
