Fast Markov Blanket Discovery Without Causal Sufficiency

Faster feature selection algorithms have become a necessity in the era of Big Data. An important class of feature selectors is Markov Blanket (MB) learning algorithms, which are causal discovery algorithms that learn the local causal structure of a target variable. A common assumption in their theoretical basis, often violated in practice, is causal sufficiency: the requirement that all common causes of the measured variables in the dataset are also in the dataset. Recently, Yu et al. (2018) proposed the M3B algorithm, the first to learn the MB directly without requiring causal sufficiency. The main drawback of M3B is its time inefficiency, which makes it intractable for high-dimensional inputs. In this paper, we derive the Fast Markov Blanket Discovery Algorithm (FMMB). Empirical results comparing FMMB to M3B on the structural learning task show that FMMB outperforms M3B in time efficiency while preserving structural accuracy. Five real-world datasets were used to compare the two algorithms as feature selectors. With naive Bayes (NB) and support vector machine (SVM) classifiers, FMMB achieved competitive results. This method mitigates the curse of dimensionality and inspires the development of local-to-global algorithms.
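To make the notion of a Markov blanket concrete, the following is a minimal sketch for the simpler, causally sufficient case, where the graph is a known DAG and the blanket of a target is its parents, children, and the other parents of its children (spouses). It is not FMMB or M3B: both learn the blanket from data, and without causal sufficiency the blanket can contain additional variables that this sketch does not cover. The function name `markov_blanket`, the parent-set graph encoding, and the toy graph are assumptions made for illustration.

```python
from typing import Dict, Set


def markov_blanket(parents: Dict[str, Set[str]], target: str) -> Set[str]:
    """Markov blanket of `target` in a DAG encoded as node -> set of parents.

    Assumes causal sufficiency (no latent common causes), so the blanket is
    parents + children + spouses of the target.
    """
    pa = set(parents.get(target, set()))
    # Children: every node that lists the target among its parents.
    children = {node for node, ps in parents.items() if target in ps}
    # Spouses: the other parents of the target's children.
    spouses: Set[str] = set()
    for child in children:
        spouses |= parents[child]
    return (pa | children | spouses) - {target}


# Hypothetical toy graph: A -> T, T -> C, B -> C, C -> D
toy_dag = {"A": set(), "B": set(), "T": {"A"}, "C": {"T", "B"}, "D": {"C"}}

print(sorted(markov_blanket(toy_dag, "T")))  # ['A', 'B', 'C']
```

In the toy graph, D is excluded from the blanket of T because, given C, it carries no further information about T. That is what makes the blanket useful as a selected feature set.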

[1] Kui Yu, et al. Causality-based Feature Selection: Methods and Evaluations, 2019.

[2] Yunhai Tong, et al. Three-Fast-Inter Incremental Association Markov Blanket learning algorithm, 2019, Pattern Recognit. Lett.

[3] Huanhuan Chen, et al. Mining Markov Blankets Without Causal Sufficiency, 2018, IEEE Transactions on Neural Networks and Learning Systems.

[4] Verónica Bolón-Canedo, et al. Recent advances and emerging challenges of feature selection in the context of big data, 2015, Knowl. Based Syst.

[5] Thomas S. Richardson, et al. Learning high-dimensional directed acyclic graphs with latent and selection variables, 2011, 1104.5617.

[6] Constantin F. Aliferis, et al. Local Causal and Markov Blanket Induction for Causal Discovery and Feature Selection for Classification Part I: Algorithms and Empirical Evaluation, 2010, J. Mach. Learn. Res.

[7] Constantin F. Aliferis, et al. Local Causal and Markov Blanket Induction for Causal Discovery and Feature Selection for Classification Part II: Analysis and Extensions, 2010, J. Mach. Learn. Res.

[8] Nir Friedman, et al. Probabilistic Graphical Models - Principles and Techniques, 2009.

[9] Jesper Tegnér, et al. Towards scalable and data efficient learning of Markov boundaries, 2007, Int. J. Approx. Reason.

[10] Dimitris Margaritis, et al. Speculative Markov blanket discovery for optimal feature selection, 2005, Fifth IEEE International Conference on Data Mining (ICDM'05).

[11] Constantin F. Aliferis, et al. Time and sample efficient discovery of Markov blankets and direct causal relations, 2003, KDD '03.

[12] Tom Burr, et al. Causation, Prediction, and Search, 2003, Technometrics.

[13] P. Spirtes, et al. Ancestral graph Markov models, 2002.

[14] Sebastian Thrun, et al. Bayesian Network Induction via Local Neighborhoods, 1999, NIPS.

[15] Daphne Koller, et al. Toward Optimal Feature Selection, 1996, ICML.

[16] Judea Pearl, et al. Probabilistic reasoning in intelligent systems - networks of plausible inference, 1991, Morgan Kaufmann series in representation and reasoning.

[17] Hao Wang, et al. Towards efficient and effective discovery of Markov blankets for feature selection, 2020, Inf. Sci.

[18] Constantin F. Aliferis, et al. Algorithms for discovery of multiple Markov boundaries, 2013, J. Mach. Learn. Res.

[19] Evgueni A. Haroutunian, et al. Information Theory and Statistics, 2011, International Encyclopedia of Statistical Science.

[20] Jesús Alcalá-Fdez, et al. KEEL Data-Mining Software Tool: Data Set Repository, Integration of Algorithms and Experimental Analysis Framework, 2011, J. Multiple Valued Log. Soft Comput.

[21] Constantin F. Aliferis, et al. Algorithms for Large Scale Markov Blanket Discovery, 2003, FLAIRS.

[22] Constantin F. Aliferis, et al. Towards Principled Feature Selection: Relevancy, Filters and Wrappers, 2003.