PQC: personalized query classification

Query classification (QC) is the task of classifying Web queries into topical categories. Because queries are usually short and ambiguous, the same query may need to be classified into different categories depending on the perspective of the person who issues it. In this paper, we propose the Personalized Query Classification (PQC) task and develop an algorithm based on user preference learning as a solution. Users' preferences, which are hidden in clickthrough logs, can help search engines better understand users' queries. We propose to connect query classification with the learning of users' preferences from clickthrough logs for PQC. To tackle the sparseness problem in clickthrough logs, we propose a collaborative ranking model that leverages information from similar users. Experiments on real-world clickthrough log data show that our proposed PQC algorithm achieves significant improvement over general QC as well as natural baselines. Our method can be applied to a wide range of applications, including personalized search and online advertising.
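The idea of combining a general query classifier with per-user preferences can be illustrated with a minimal sketch. This is not the collaborative ranking model proposed in the paper: it swaps in a simple smoothed count of the categories of a user's clicked pages as the preference estimate, and reweights general QC scores by it. The function names, the category labels, the example scores, and the multiplicative combination are all assumptions made for illustration.

```python
from collections import Counter


def user_category_preferences(clicked_categories, all_categories, smoothing=1.0):
    """Estimate a user's preference over categories from clicked-page categories.

    Assumption: a smoothed relative frequency stands in for the learned
    preference model; the paper instead uses collaborative ranking.
    """
    counts = Counter(clicked_categories)
    total = len(clicked_categories) + smoothing * len(all_categories)
    return {c: (counts[c] + smoothing) / total for c in all_categories}


def personalize(general_scores, preferences):
    """Reweight general QC scores by user preference and renormalize to sum to 1."""
    weighted = {c: general_scores[c] * preferences[c] for c in general_scores}
    norm = sum(weighted.values())
    return {c: w / norm for c, w in weighted.items()}


# Hypothetical example: the ambiguous query "jaguar".
categories = ["Autos", "Animals", "Computing"]
general = {"Autos": 0.5, "Animals": 0.4, "Computing": 0.1}  # made-up general QC scores
clicks = ["Animals", "Animals", "Animals", "Autos"]  # made-up click history for one user

prefs = user_category_preferences(clicks, categories)
personalized = personalize(general, prefs)
top_general = max(general, key=general.get)
top_personalized = max(personalized, key=personalized.get)
```

In this toy setting the general classifier ranks "Autos" first, while the user's click history shifts the top category to "Animals". A count-based estimate like this breaks down when a user has few clicks, which is the sparseness problem the abstract addresses by borrowing information from similar users.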
