Parallel Filter-Based Feature Selection Based on Balanced Incomplete Block Designs

In this paper we propose a method for scaling up filter-based feature selection in classification problems. We use conditional mutual information as the filter measure and show how the required statistics can be computed in parallel while avoiding unnecessary calculations. The distribution of the calculations among the available computing units is determined using balanced incomplete block designs, a strategy originally developed in the statistical design of experiments. We demonstrate the scalability of our method through a series of experiments on synthetic and real-world datasets.
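The distribution scheme can be illustrated with a small, hypothetical sketch (not the implementation used in the paper): the (7, 3, 1) balanced incomplete block design, in which every pair of 7 features appears together in exactly one of 7 blocks of size 3, so assigning one block per computing unit ensures each pairwise statistic is computed exactly once. For brevity the sketch scores pairs with plain empirical mutual information rather than the class-conditional mutual information used as the filter measure in the paper.

```python
# Minimal sketch, assuming the (7, 3, 1) BIBD (the Fano plane) and plain
# pairwise mutual information as a stand-in for the paper's filter measure.
from itertools import combinations
from multiprocessing import Pool

import numpy as np

# Blocks of the (7, 3, 1) BIBD over feature indices 0..6; every pair of
# features occurs together in exactly one block.
FANO_BLOCKS = [
    (0, 1, 2), (0, 3, 4), (0, 5, 6),
    (1, 3, 5), (1, 4, 6), (2, 3, 6), (2, 4, 5),
]


def empirical_mi(x, y):
    """Empirical mutual information (in nats) between two discrete vectors."""
    joint = np.zeros((x.max() + 1, y.max() + 1))
    for xi, yi in zip(x, y):
        joint[xi, yi] += 1
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum())


def process_block(args):
    """Worker: compute the statistic for every feature pair inside one block."""
    block, data = args
    return {(i, j): empirical_mi(data[:, i], data[:, j])
            for i, j in combinations(sorted(block), 2)}


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.integers(0, 3, size=(500, 7))   # toy discrete dataset

    # One computing unit per block; each block is processed independently.
    with Pool(processes=len(FANO_BLOCKS)) as pool:
        partial = pool.map(process_block, [(b, data) for b in FANO_BLOCKS])

    scores = {}
    for d in partial:
        scores.update(d)

    # All 21 feature pairs are covered, each computed exactly once.
    assert len(scores) == len(list(combinations(range(7), 2)))
    print(sorted(scores.items())[:3])
```

In general, a design with v features, block size k, and index lambda = 1 yields v(v - 1) / (k(k - 1)) blocks, each of which can be assigned to a computing unit and processed without recomputing any pairwise statistic.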
