On Aggregating Salaries of Occupations From Job Post and Review Data

The popularity of job websites has significantly changed the way people learn about different occupations. Among the insights these websites offer are occupation salary statistics, which are useful to job seekers, career coaches, graduating students, and labor-related government agencies. Such statistics summarize the distribution of job salaries for each occupation, for example as an average or as quantiles. However, when we gather job post and review data from job websites, we find significant variability in offer salary (and review salary) among jobs of the same occupation. This variability points to biases, including salary competitiveness in job posts and salary inflation in job reviews. Based on this observation, we aim to develop an approach that derives an occupation salary for a job market, named the unbiased salary, by aggregating offer salaries from job posts and review salaries from review data while removing their biases. To achieve this goal, we propose the COC-model, which efficiently learns the unbiased salaries of occupations together with the competitiveness and inflation of companies. COC is an abbreviation of “Company, Occupation, Company”, representing the two different connections between companies and occupations, one from the job posting site and one from the job review site. The COC-model represents the dependency of salary information between companies and occupations in job post data and job review data. It begins by defining three latent variables, namely competitiveness, inflation, and unbiased salary, based on their dependencies. Instead of computing these variables iteratively, we formulate the interactions among the three latent variables in matrix form so that their values can be learned efficiently in a unified way by a series of matrix operations. We conduct extensive experiments, including empirical studies of company competitiveness and inflation on a real dataset and performance testing on a synthetic dataset. The experimental results show that the COC-model not only derives unbiased salaries effectively but also helps us understand latent biases in job post and job review data.
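
To make the idea of learning the three latent variables in one matrix solve concrete, the sketch below fits a simple additive model by ordinary least squares. It is an illustration only and not the COC-model, whose exact formulation is not given in this abstract. The assumptions are ours: log offer salary = occupation term + company competitiveness, log review salary = occupation term + company inflation, and all bias terms sum to zero. The function name `fit_additive_salary_model` and the `(company, occupation, salary)` record layout are hypothetical.

```python
import numpy as np


def fit_additive_salary_model(posts, reviews, n_companies, n_occupations):
    """Illustrative least-squares fit; NOT the COC-model itself.

    posts, reviews: iterables of (company_index, occupation_index, salary).
    Assumed model, on the log scale:
        log(offer salary)  = u[occupation] + comp[company]
        log(review salary) = u[occupation] + infl[company]
    """
    n_params = n_occupations + 2 * n_companies
    comp_start = n_occupations                 # first competitiveness column
    infl_start = n_occupations + n_companies   # first inflation column

    rows, targets = [], []
    for bias_start, records in ((comp_start, posts), (infl_start, reviews)):
        for company, occupation, salary in records:
            row = np.zeros(n_params)
            row[occupation] = 1.0              # occupation's unbiased term
            row[bias_start + company] = 1.0    # company's bias term
            rows.append(row)
            targets.append(np.log(salary))

    # Anchor row (assumption): all bias terms sum to zero. Without it, a
    # constant could be moved between the occupation terms and the bias
    # terms without changing any predicted salary.
    anchor = np.zeros(n_params)
    anchor[comp_start:] = 1.0
    rows.append(anchor)
    targets.append(0.0)

    # One linear solve estimates all three groups of variables jointly.
    theta, *_ = np.linalg.lstsq(np.vstack(rows), np.array(targets), rcond=None)

    unbiased_salary = np.exp(theta[:n_occupations])
    competitiveness = theta[comp_start:infl_start]   # log-scale offsets
    inflation = theta[infl_start:]                   # log-scale offsets
    return unbiased_salary, competitiveness, inflation


if __name__ == "__main__":
    # Made-up records for two companies and two occupations.
    posts = [(0, 0, 55000.0), (0, 1, 88000.0), (1, 0, 46000.0), (1, 1, 73000.0)]
    reviews = [(0, 0, 52000.0), (0, 1, 84000.0), (1, 0, 48000.0), (1, 1, 76000.0)]
    print(fit_additive_salary_model(posts, reviews, n_companies=2, n_occupations=2))
```

Under these assumptions, a positive competitiveness value means a company posts above the estimated unbiased salary, and a positive inflation value means its reviews report above it. The zero-sum anchor is one arbitrary way to make the solution unique; a different normalization would shift every estimate by a common factor.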