Query-dependent cross-domain ranking in heterogeneous network

Traditional learning-to-rank work focuses mainly on a single type of object. However, with the rapid growth of Web 2.0, ranking over multiple interrelated, heterogeneous objects has become common, e.g., in a heterogeneous academic network. In this scenario, one may have plenty of training data for some types of objects (e.g., conferences) but very little for the object types of interest (e.g., authors). Two questions therefore arise: (1) Given a networked data set, how can one borrow supervision from other types of objects to build an accurate ranking model for the objects of interest, for which supervision is insufficient? (2) If there are links between different objects, how can their relationships be exploited to improve ranking performance? In this work, we first propose a regularized framework called HCDRank that simultaneously minimizes two loss functions, one for each of the two domains. We then extend the approach by exploiting the link information between heterogeneous objects. We conduct a theoretical analysis of the proposed approach and derive its generalization bound to show how the two related domains can help each other in learning ranking functions. Experimental results on three different genres of data sets demonstrate the effectiveness of the proposed approaches.
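
To make the idea of "simultaneously minimizing two loss functions" concrete, the following is a minimal sketch under stated assumptions; it is not the HCDRank formulation, which the abstract does not spell out. It assumes that both domains have already been mapped into a common feature space, that supervision comes as (preferred, less preferred) pairs, and that a pairwise hinge loss with an L2 regularizer is used. The names `pairwise_hinge`, `joint_objective`, `train`, the trade-off weight `C`, and the toy data are all hypothetical.

```python
def pairwise_hinge(w, pairs):
    """Mean hinge loss over (better, worse) feature-vector pairs."""
    total = 0.0
    for better, worse in pairs:
        margin = sum(wi * (b - c) for wi, b, c in zip(w, better, worse))
        total += max(0.0, 1.0 - margin)
    return total / len(pairs)


def joint_objective(w, source_pairs, target_pairs, C=1.0, lam=0.01):
    """Source loss + C * target loss + L2 penalty on the shared weights."""
    reg = lam * sum(wi * wi for wi in w)
    return pairwise_hinge(w, source_pairs) + C * pairwise_hinge(w, target_pairs) + reg


def subgradient(w, source_pairs, target_pairs, C, lam):
    """A subgradient of joint_objective with respect to w."""
    grad = [2.0 * lam * wi for wi in w]
    for pairs, weight in ((source_pairs, 1.0), (target_pairs, C)):
        for better, worse in pairs:
            diff = [b - c for b, c in zip(better, worse)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            if margin < 1.0:  # pair violates the margin, so it contributes
                for i, di in enumerate(diff):
                    grad[i] -= weight * di / len(pairs)
    return grad


def train(source_pairs, target_pairs, dim, C=1.0, lam=0.01, lr=0.1, steps=200):
    """Plain subgradient descent on the joint objective, starting from zero."""
    w = [0.0] * dim
    for _ in range(steps):
        g = subgradient(w, source_pairs, target_pairs, C, lam)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w


# Toy data (made up): many labeled pairs in the source domain, one in the target.
source_pairs = [
    ([1.0, 0.0], [0.0, 0.0]),
    ([2.0, 1.0], [1.0, 1.0]),
    ([3.0, 0.5], [1.0, 0.5]),
]
target_pairs = [([1.0, 0.2], [0.5, 0.2])]

w = train(source_pairs, target_pairs, dim=2)
print("learned weights:", w)
print("joint objective:", joint_objective(w, source_pairs, target_pairs))
```

In this sketch the single target pair alone would give a weak signal, while the source pairs push the shared weight vector in a consistent direction; `C` controls how much the scarce target supervision counts relative to the source supervision.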
