Using visual and text features for direct marketing on multimedia messaging services domain

Traditionally, direct marketing companies have relied on pre-testing to select the best offers to send to their audience. Companies systematically dispatch the offers under consideration to a limited sample of potential buyers, rank the offers by their performance on that sample and, based on this ranking, decide which offers to send to the wider population. Although this pre-testing process is simple and widely used, the industry has recently come under increased pressure to optimize the learning phase further, in particular when facing severe constraints on time and on the learning space. The main contribution of the present work is to demonstrate that direct marketing firms can exploit information about the visual content of offers to optimize the learning phase. This paper proposes a two-phase learning strategy based on a cascade of regression methods that takes advantage of visual and text features to improve and accelerate the learning process. Experiments in the domain of a commercial Multimedia Messaging Service (MMS) show the effectiveness of the proposed methods and a significant improvement over traditional learning techniques. The proposed approach can be used in any multimedia direct marketing domain in which offers comprise both a visual and a text component.
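
The traditional pre-testing baseline described above can be sketched in a few lines. This is only an illustration of the rank-and-select step, not the proposed two-phase method: the function names, the offer identifiers, and the counts are hypothetical and do not come from the paper, and response rate is assumed here as the performance measure.

```python
from typing import Dict, List, Tuple


def rank_offers_by_pretest(
    results: Dict[str, Tuple[int, int]]
) -> List[Tuple[str, float]]:
    """Rank offers by response rate observed on the pre-test sample.

    `results` maps an offer id to (messages sent, positive responses).
    Offers that were never sent are skipped, since no rate can be computed.
    """
    rates = []
    for offer, (sent, responses) in results.items():
        if sent <= 0:
            continue
        rates.append((offer, responses / sent))
    # Highest response rate first.
    return sorted(rates, key=lambda pair: pair[1], reverse=True)


def select_top_offers(results: Dict[str, Tuple[int, int]], k: int) -> List[str]:
    """Return the ids of the k best-performing offers from the pre-test."""
    return [offer for offer, _ in rank_offers_by_pretest(results)[:k]]


# Made-up pre-test outcome: each offer sent to 200 sampled users.
pretest = {
    "offer_a": (200, 14),
    "offer_b": (200, 9),
    "offer_c": (200, 21),
}
print(select_top_offers(pretest, 2))  # ['offer_c', 'offer_a']
```

Every offer must be sent to the sample before it can be ranked, which is the cost that the time and learning space constraints mentioned above make hard to afford.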
