A feature-pair-based associative classification approach to look-alike modeling for conversion-oriented user-targeting in tail campaigns

Online advertising offers significantly finer targeting granularity, which state-of-the-art targeting methods such as Behavioral Targeting (BT) exploit. Recent work on Look-alike Modeling (LAM) complements these methods by building models customized to each advertiser's requirements and each campaign's characteristics, and by showing ads to the users who are most likely to convert on them, not just click on them. In Look-alike Modeling, given data about converters and non-converters obtained from advertisers, we would like to train a model automatically for each ad campaign. Such custom models help target more users who are similar to the set of converters the advertiser provides, and they give advertisers more freedom to define the preferred sets of users on which custom targeting models are built. In behavioral data, the number of conversions (the positive class) per campaign is very small: conversions per impression for the advertisers in our data set are much less than 10^-4. The resulting training set is highly skewed, with most records belonging to the negative class. Campaigns with very few conversions are called tail campaigns, and those with many conversions are called head campaigns. Building Look-alike Models for tail campaigns with popular classifiers such as linear SVM and GBDT is very challenging because such campaigns contain so few positive-class examples. In this paper, we present an Associative Classification (AC) approach to LAM for tail campaigns. Pairs of features are used to derive rules for a rule-based associative classifier, and the rules are sorted by frequency-weighted log-likelihood ratio (F-LLR). The top k rules by F-LLR are then applied to each test record to score it. Individual features can also form rules by themselves, though the number of such rules, both in the top k and in the whole rule set, is very small.
Our algorithm is implemented on Hadoop and therefore runs efficiently.
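
The rule-ranking and scoring steps described above can be illustrated with a minimal single-machine sketch. The abstract does not define F-LLR or the scoring function, so several choices here are assumptions rather than the paper's method: F-LLR is taken to be P(rule | converter) * log(P(rule | converter) / P(rule | non-converter)), add-one smoothing keeps the ratio finite, and a record's score is the mean log-likelihood ratio of the top-k rules it matches. The names `mine_rules` and `score_record` and the toy data are hypothetical, and nothing here reflects the Hadoop implementation.

```python
import math
from collections import Counter
from itertools import combinations


def mine_rules(records, labels, smoothing=1.0):
    """Build single-feature and feature-pair rules, sorted by descending F-LLR.

    Assumed definition (not given in the abstract):
        F-LLR = P(rule | converter) * log(P(rule | converter) / P(rule | non-converter))
    Returns a list of (itemset, f_llr, llr) tuples.
    """
    pos_counts, neg_counts = Counter(), Counter()
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos

    for feats, y in zip(records, labels):
        feats = sorted(set(feats))
        # Candidate rule antecedents: every single feature and every feature pair.
        itemsets = [(f,) for f in feats] + list(combinations(feats, 2))
        (pos_counts if y == 1 else neg_counts).update(itemsets)

    rules = []
    for itemset in set(pos_counts) | set(neg_counts):
        # Add-one smoothing (an assumption) avoids log(0) and division by zero.
        p_pos = (pos_counts[itemset] + smoothing) / (n_pos + 2 * smoothing)
        p_neg = (neg_counts[itemset] + smoothing) / (n_neg + 2 * smoothing)
        llr = math.log(p_pos / p_neg)
        rules.append((itemset, p_pos * llr, llr))

    # Highest F-LLR first; ties are broken by itemset for a deterministic order.
    rules.sort(key=lambda r: (-r[1], r[0]))
    return rules


def score_record(record, rules, k):
    """Score a record as the mean LLR of the top-k rules it matches (assumed scheme)."""
    feats = set(record)
    matched = [llr for itemset, _, llr in rules[:k] if feats.issuperset(itemset)]
    return sum(matched) / len(matched) if matched else 0.0


if __name__ == "__main__":
    # Toy data: two converters (label 1) and four non-converters (label 0).
    records = [
        ["sports", "autos"],
        ["sports", "autos", "news"],
        ["news", "finance"],
        ["news"],
        ["finance", "sports"],
        ["news", "finance"],
    ]
    labels = [1, 1, 0, 0, 0, 0]
    rules = mine_rules(records, labels)
    for itemset, f_llr, _ in rules[:3]:
        print(itemset, round(f_llr, 3))
    print(score_record(["sports", "autos"], rules, k=3))
```

Weighting the log-likelihood ratio by the rule's frequency among converters favors rules that are both discriminative and reasonably common, which matters when a tail campaign has only a handful of positive examples.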
