Transductive Optimization of Top k Precision

Consider a binary classification problem in which the learner is given a labeled training set and an unlabeled test set, and must output exactly $k$ test points as positive predictions. Problems of this kind---{\it transductive precision@$k$}---arise in information retrieval, digital advertising, and reserve design for endangered species. Previous methods separate the training of the model from its use in scoring the test points. This paper introduces a new approach, Transductive Top K (TTK), that seeks to minimize the hinge loss over all training instances under the constraint that exactly $k$ test instances are predicted as positive. The paper presents two optimization methods for this challenging problem. Experiments and analysis confirm the importance of incorporating the knowledge of $k$ into the learning process. In experimental evaluations, TTK matches or exceeds existing state-of-the-art methods on 7 UCI datasets and 3 reserve design problem instances.
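The objective and constraint described above can be made concrete with a small sketch. It is not one of the paper's two optimization methods; it only evaluates, for a fixed linear scorer, the hinge loss on the training set and a bias under which exactly $k$ test points are predicted positive. The linear model, the labels in {-1, +1}, and the helper names `bias_for_exactly_k`, `hinge_loss`, and `precision_at_k` are assumptions made for illustration.

```python
import numpy as np


def bias_for_exactly_k(test_scores, k):
    """Return a bias b such that exactly k entries of test_scores + b are > 0.

    Assumes 0 < k < len(test_scores) and that the k-th and (k+1)-th largest
    scores differ (hypothetical helper, not taken from the paper).
    """
    s = np.sort(np.asarray(test_scores, dtype=float))[::-1]  # descending
    if not 0 < k < len(s):
        raise ValueError("k must satisfy 0 < k < number of test points")
    if s[k - 1] == s[k]:
        raise ValueError("tie at the k-th score; no bias gives exactly k positives")
    # Place the decision threshold midway between the k-th and (k+1)-th scores.
    return -(s[k - 1] + s[k]) / 2.0


def hinge_loss(w, b, X, y):
    """Sum of hinge losses max(0, 1 - y * (X @ w + b)) with labels in {-1, +1}."""
    margins = y * (X @ w + b)
    return float(np.maximum(0.0, 1.0 - margins).sum())


def precision_at_k(scores, y_true, k):
    """Fraction of the k highest-scoring points whose true label is +1."""
    top = np.argsort(-np.asarray(scores, dtype=float))[:k]
    return float(np.mean(np.asarray(y_true)[top] == 1))


# Toy one-dimensional example (made-up numbers, for illustration only).
w = np.array([1.0])
X_train = np.array([[2.0], [0.0]])
y_train = np.array([1, -1])
X_test = np.array([[3.0], [1.0], [2.0], [0.0]])
y_test = np.array([1, -1, -1, 1])
k = 2

test_scores = X_test @ w
b = bias_for_exactly_k(test_scores, k)
n_positive = int(np.sum(test_scores + b > 0))
train_loss = hinge_loss(w, b, X_train, y_train)
p_at_k = precision_at_k(test_scores, y_test, k)
```

In this sketch the bias is fixed by the test scores, so the training loss depends on $k$ through `b`. That coupling between the test-set constraint and the training objective is what the abstract describes; how TTK actually searches over the weights is left to the paper.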
