In sponsored search auctions, the auctioneer operates the marketplace by setting auction parameters, such as reserve prices, in order to optimize the auction. These parameters could be set for each individual keyword, but with millions of keywords the optimization problem becomes intractable. To reduce dimensionality and generalize well, one instead clusters keywords or queries into meaningful groups and sets parameters at the keyword-cluster level. For auction optimization, keywords should be treated as interchangeable commodities with respect to their valuations from advertisers, which are represented as bid distributions or landscapes. Keyword clustering for auction optimization should therefore be based on bid distributions. In this paper we present a formalism for clustering probability distributions and apply it to query clustering, where each query is represented as a probability density of click-through rate (CTR) weighted bids and distortion is measured by KL divergence. We first derive a k-means variant for clustering Gaussian densities, for which the KL divergence has a closed form. We then develop an algorithm for clustering Gaussian mixture densities, which generalize the single Gaussian and are typically a more realistic parametric assumption for real-world data. The KL divergence between Gaussian mixture densities is not analytically tractable, so we derive a variational EM algorithm that minimizes an upper bound on the total within-cluster KL divergence. The clustering algorithm has been deployed successfully in production, yielding significant improvements in revenue and clicks over the existing production system. Although motivated by the specific setting of query clustering, the proposed method applies generally to real-world problems in which an example is better characterized by a distribution than by a finite-dimensional feature vector in Euclidean space, as assumed in classical k-means.
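To make the first step concrete, the following is a minimal sketch of a k-means-style procedure for clustering univariate Gaussian densities under KL divergence. It is an illustration under stated assumptions, not the algorithm as derived in the paper: the function names `kl_gauss` and `kmeans_gaussians` are hypothetical, the densities are assumed to be one-dimensional, the distortion is taken as KL(member || centroid), and the centroid update is the moment-matching Gaussian that minimizes the summed divergence in that direction.

```python
import numpy as np


def kl_gauss(mu_p, var_p, mu_q, var_q):
    """Closed-form KL(p || q) between univariate Gaussians (broadcasts over arrays)."""
    return 0.5 * (np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)


def kmeans_gaussians(mu, var, k, n_iter=50, seed=0):
    """Cluster Gaussians N(mu[i], var[i]) into k groups (hypothetical helper).

    Assumption: distortion is KL(member || centroid); the paper may use
    a different direction or a multivariate form.
    """
    rng = np.random.default_rng(seed)
    mu = np.asarray(mu, dtype=float)
    var = np.asarray(var, dtype=float)

    # Initialize centroids from k distinct input densities.
    idx = rng.choice(len(mu), size=k, replace=False)
    c_mu, c_var = mu[idx].copy(), var[idx].copy()
    labels = np.full(len(mu), -1)

    for _ in range(n_iter):
        # Assignment step: (n, k) matrix of divergences to each centroid.
        d = kl_gauss(mu[:, None], var[:, None], c_mu[None, :], c_var[None, :])
        new_labels = d.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels

        # Update step: match the first two moments of each cluster's members.
        for c in range(k):
            members = labels == c
            if not members.any():
                continue  # keep the previous centroid for an empty cluster
            c_mu[c] = mu[members].mean()
            c_var[c] = (var[members] + mu[members] ** 2).mean() - c_mu[c] ** 2

    return labels, c_mu, c_var
```

Calling `kmeans_gaussians(mu, var, k=2)` on arrays of per-query means and variances returns a cluster label for each density together with the centroid parameters. For Gaussian mixtures the divergence has no closed form, so this sketch does not carry over directly; that case is where the abstract's variational upper bound comes in.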