Boosted Network Classifiers for Local Feature Selection

Like all models, network feature selection models require assumptions about the size and structure of the features to be selected. The most common assumption is sparsity: only a small portion of the network is presumed to drive a given phenomenon. Sparsity is typically enforced through regularized models, such as the lasso. However, sparsity may be an inappropriate assumption for many real-world networks, which contain highly correlated modules. In this paper, we introduce two novel optimization strategies, boosted expectation propagation (BEP) and boosted message passing (BMP), which use the network structure directly to estimate the parameters of a network classifier. BEP and BMP are ensemble methods that optimize classification performance by combining individual models built on local network features. Neither method assumes a sparse solution; instead, each computes a weighted average over all network features, with weights that emphasize the features useful for classification. We compare BEP and BMP with network-regularized logistic regression models on simulated and real biological networks. The results show that, where highly correlated network structure exists, assuming sparsity adversely affects both the accuracy and the feature selection power of the network classifier.
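To make the contrast concrete, the sketch below (our illustration, not the authors' implementation of BEP or BMP) compares an L1-penalized logistic regression, which tends to pick a single representative from a correlated feature module, with a simple AdaBoost-style ensemble whose weak learners each see only one node's local neighborhood. The toy data, chain-graph neighborhoods, and hyperparameters are all hypothetical.

```python
# Minimal sketch, assuming a chain-structured network over 20 features where
# features 0-4 form one highly correlated module that determines the label.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, p = 100, 20
X = rng.standard_normal((n, p))
X[:, 1:5] = X[:, [0]] + 0.1 * rng.standard_normal((n, 4))  # correlated module
y = (X[:, :5].sum(axis=1) > 0).astype(int)

# Sparse baseline: the lasso tends to keep one representative of the
# correlated module and zero out the rest.
lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
print("lasso nonzero coefficients:", np.flatnonzero(lasso.coef_))

# Boosted local classifiers: each weak learner is fit on one node's
# neighborhood (here, the node and its chain neighbors); AdaBoost-style
# weights can emphasize every neighborhood useful for classification.
neighborhoods = [list(range(max(0, j - 1), min(p, j + 2))) for j in range(p)]
w = np.full(n, 1.0 / n)                      # sample weights
alphas, learners = [], []
for _ in range(10):
    best = None
    for nb in neighborhoods:                 # pick lowest weighted error
        clf = LogisticRegression().fit(X[:, nb], y, sample_weight=w)
        err = np.sum(w * (clf.predict(X[:, nb]) != y))
        if best is None or err < best[0]:
            best = (err, nb, clf)
    err, nb, clf = best
    err = min(max(err, 1e-10), 1 - 1e-10)    # guard the log
    alpha = 0.5 * np.log((1 - err) / err)
    margin = (2 * y - 1) * (2 * clf.predict(X[:, nb]) - 1)  # {0,1} -> {-1,+1}
    w = w * np.exp(-alpha * margin)          # reweight misclassified samples
    w = w / w.sum()
    alphas.append(alpha)
    learners.append((nb, clf))

# Ensemble prediction: a weighted vote over all local classifiers.
score = sum(a * (2 * clf.predict(X[:, nb]) - 1)
            for a, (nb, clf) in zip(alphas, learners))
print("boosted training accuracy:", np.mean((score > 0) == y))
```

On data like this, the lasso typically reports only one or two nonzero coefficients from the five-feature module, while the boosted ensemble spreads weight across every neighborhood that overlaps it, mirroring the paper's argument about correlated structure.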
