Document-based Recommender System for Job Postings using Dense Representations

Job boards and professional social networks rely heavily on recommender systems to help users explore job advertisements. Detecting similarity between job advertisements is important for job recommendation because it enables, for example, item-to-item recommendations. In this work, we investigate the use of dense vector representations to enhance a large-scale job recommendation system and to rank German job advertisements by similarity. We follow a twofold evaluation scheme: (1) we exploit historic user interactions to automatically create a dataset of similar jobs, which enables an offline evaluation; (2) we conduct an online A/B test and evaluate the best-performing method on our platform, which reaches more than 1 million users. We achieve the best results by combining job titles with full-text job descriptions. In particular, this method builds dense document representations in which the words of the title are used to weight the importance of the words of the full-text description. In the online evaluation, this approach increases the click-through rate on job recommendations for active users by 8.0%.
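
The title-weighted document representation described above can be illustrated with a minimal sketch. The abstract does not state the exact weighting formula, so the choice below (each description word is weighted by its maximum cosine similarity to any title word, clipped at zero) is an assumption made for illustration only. The toy vectors, the job examples, and the function names `title_weighted_embedding` and `rank_similar_jobs` are likewise hypothetical and not taken from the paper.

```python
import math

# Toy 3-dimensional word vectors; hypothetical values, not trained embeddings.
TOY_VECTORS = {
    "java": [0.9, 0.1, 0.0],
    "entwickler": [0.8, 0.2, 0.1],
    "software": [0.85, 0.15, 0.05],
    "team": [0.1, 0.9, 0.1],
    "obstkorb": [0.0, 0.1, 0.9],
    "koch": [0.05, 0.2, 0.95],
    "küche": [0.0, 0.15, 0.9],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def title_weighted_embedding(title_tokens, description_tokens, vectors):
    """Weighted average of description word vectors.

    Assumed weighting: max cosine similarity to any title word, clipped at 0.
    If no title word has a vector, every description word gets weight 1.0.
    """
    title_vecs = [vectors[t] for t in title_tokens if t in vectors]
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for token in description_tokens:
        if token not in vectors:
            continue  # skip out-of-vocabulary words
        vec = vectors[token]
        weight = max((cosine(vec, tv) for tv in title_vecs), default=1.0)
        weight = max(weight, 0.0)
        total = [acc + weight * x for acc, x in zip(total, vec)]
        weight_sum += weight
    if weight_sum == 0.0:
        return total
    return [x / weight_sum for x in total]


def rank_similar_jobs(query_id, embeddings):
    """Rank all other jobs by cosine similarity to the query job, best first."""
    query = embeddings[query_id]
    scored = [
        (job_id, cosine(query, emb))
        for job_id, emb in embeddings.items()
        if job_id != query_id
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical job postings as (title tokens, description tokens).
JOBS = {
    "job_a": (["java", "entwickler"], ["software", "team", "obstkorb"]),
    "job_b": (["software", "entwickler"], ["java", "team"]),
    "job_c": (["koch"], ["küche", "team", "obstkorb"]),
}

EMBEDDINGS = {
    job_id: title_weighted_embedding(title, desc, TOY_VECTORS)
    for job_id, (title, desc) in JOBS.items()
}

ranking = rank_similar_jobs("job_a", EMBEDDINGS)
print(ranking)
```

In this toy setup, generic description words such as "team" or "obstkorb" receive low weights in the developer posting because they are dissimilar to its title words, so the other developer posting ranks above the cook posting. A real system would use embeddings trained on a large corpus of job advertisements and could combine this weighting with other schemes.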
