Screening for learning classification rules via Boolean compressed sensing

Convex relaxations for sparse representation problems, which aim to find sparse solutions to systems of equations, have enabled a variety of exciting applications in high-dimensional settings. Yet, when dimensions grow large enough, even these convex formulations become prohibitively expensive. Screening methods use duality theory to dramatically reduce the size of the optimization problem through easily computable certificates guaranteeing that many of the variables must be zero in the optimal solution. In this paper we consider learning sparse classification rules via Boolean compressed sensing and develop screening procedures that can significantly reduce the size of the resulting linear program. Boolean compressed sensing deals with systems of Boolean equations (instead of the linear equations of traditional compressed sensing); we develop screening methods specifically for this setting. We demonstrate the effectiveness of our screening rules on several real-world classification data sets.
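To make the screening idea concrete, here is a minimal illustrative sketch (not the paper's own certificates): in Boolean compressed sensing, each measurement is a Boolean OR, y_i = ∨_j (A_ij ∧ x_j), so a zero-valued measurement immediately certifies that every variable appearing in that measurement must be zero. Eliminating those variables shrinks the linear program before it is ever solved. The function name and toy data below are invented for illustration.

```python
# Illustrative sketch: the simplest screening certificate in Boolean
# compressed sensing / group testing. A negative test (y_i = 0) proves
# that every variable j with A[i][j] = 1 is zero in any feasible
# solution, so those variables can be removed from the LP up front.

def screen_boolean_cs(A, y):
    """Return indices of variables certified zero by negative tests."""
    n = len(A[0])
    certified_zero = set()
    for row, yi in zip(A, y):
        if yi == 0:  # negative test: no active variable participated
            certified_zero.update(j for j in range(n) if row[j] == 1)
    return sorted(certified_zero)

# Toy instance: the hidden sparse vector x generates Boolean measurements.
A = [[1, 1, 0, 0],
     [0, 1, 1, 0],
     [0, 0, 1, 1]]
x = [1, 0, 0, 1]
y = [max(a * xi for a, xi in zip(row, x)) for row in A]  # Boolean OR
print(y)                        # [1, 0, 1]
print(screen_boolean_cs(A, y))  # [1, 2]: both certified zero
```

The second (negative) test rules out variables 1 and 2, halving the problem; the paper's dual-based screening rules pursue the same goal with stronger certificates.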
