Efficient Feature Selection Framework for Digital Marketing Applications

Digital marketing strategies can help businesses achieve a better return on investment (ROI), and big data and predictive modelling are key to identifying the specific customers to target. However, the abundant and largely irrelevant attributes (features) in such data degrade predictive modelling, both in computational cost and in model quality, so selecting relevant features is a crucial task for marketing applications. Feature selection is itself very time-consuming because of the large volume of data and the high dimensionality of the feature space. In this paper, we propose to reduce the computation time by regularizing the feature search process using expert knowledge. We also combine the regularized search with a generative filtering step, which addresses potential problems with the regularized search and further speeds up the process. In addition, we build a progressive sampling and coarse-to-fine selection framework to further lower the space and time requirements.
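The idea of progressive sampling combined with coarse-to-fine selection can be illustrated with a minimal sketch. It is not the framework proposed here: the abstract does not specify the scoring criterion, so absolute Pearson correlation is swapped in as a simple filter score, and the expert-knowledge regularization and the generative filtering step are left out entirely. The names `abs_correlation`, `coarse_to_fine_select`, and `schedule` are hypothetical.

```python
import random
from statistics import mean, pstdev


def abs_correlation(xs, ys):
    """Absolute Pearson correlation; returns 0.0 if either input is constant."""
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    mx, my = mean(xs), mean(ys)
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return abs(cov / (sx * sy))


def coarse_to_fine_select(rows, target, schedule, seed=0):
    """Shrink the candidate feature set over progressively larger samples.

    rows     -- list of dicts mapping feature name -> numeric value
    target   -- list of numeric labels, aligned with rows
    schedule -- list of (sample_fraction, keep_count) stages (assumed format),
                typically with growing fractions and shrinking keep counts
    """
    rng = random.Random(seed)
    order = list(range(len(rows)))
    rng.shuffle(order)  # one shuffle, so each stage's sample nests the previous one
    candidates = sorted(rows[0].keys())
    for fraction, keep in schedule:
        n = max(2, int(len(rows) * fraction))
        idx = order[:n]
        ys = [target[i] for i in idx]
        # Score only the surviving candidates: cheap early stages prune most
        # features before the expensive full-data stage is reached.
        scores = {
            f: abs_correlation([rows[i][f] for i in idx], ys) for f in candidates
        }
        candidates = sorted(candidates, key=lambda f: scores[f], reverse=True)[:keep]
    return candidates
```

With a schedule such as `[(0.2, 4), (1.0, 2)]`, every feature is scored on a 20% sample and only the four best survive to be re-scored on the full data. The saving comes from never scoring the discarded features on the larger samples.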
