The Bayesian group-Lasso for analyzing contingency tables

Group-Lasso estimators, useful in many applications, suffer from a lack of meaningful variance estimates for regression coefficients. To overcome this problem, we propose a full Bayesian treatment of the Group-Lasso, extending the standard Bayesian Lasso by means of a hierarchical expansion. The method is then applied to Poisson models for contingency tables using a highly efficient MCMC algorithm. Simulated experiments validate the performance of the method on artificial datasets with known ground truth. When applied to a breast cancer dataset, the method demonstrates its capability to identify differences in the interaction patterns of marker proteins between different patient groups.
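The abstract does not spell out the hierarchical expansion, so the following is only a minimal sketch of one standard construction, not necessarily the parameterization used in the paper. It assumes the group-Lasso prior on a coefficient group of size m, proportional to exp(-lam * ||beta||), is written as a scale mixture of normals: tau^2 ~ Gamma(shape=(m+1)/2, rate=lam^2/2) and beta | tau^2 ~ N(0, tau^2 I), with the noise scale fixed to 1. The function names `sample_group_prior` and `mean_group_norm` are hypothetical and exist only for this illustration.

```python
import math
import random


def sample_group_prior(m, lam, rng):
    """Draw one coefficient group of size m from the assumed hierarchical prior."""
    # tau^2 ~ Gamma(shape=(m+1)/2, rate=lam^2/2).
    # random.gammavariate takes (shape, scale), so scale = 2 / lam^2.
    tau2 = rng.gammavariate((m + 1) / 2.0, 2.0 / lam ** 2)
    sd = math.sqrt(tau2)
    # beta | tau^2 ~ N(0, tau^2 * I_m)
    return [rng.gauss(0.0, sd) for _ in range(m)]


def mean_group_norm(m, lam, n_draws=50000, seed=0):
    """Monte Carlo estimate of E||beta|| under the hierarchical prior."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        beta = sample_group_prior(m, lam, rng)
        total += math.sqrt(sum(b * b for b in beta))
    return total / n_draws


if __name__ == "__main__":
    m, lam = 3, 2.0
    # If the marginal density is proportional to exp(-lam * ||beta||),
    # the norm ||beta|| follows Gamma(shape=m, rate=lam), with mean m / lam.
    print("Monte Carlo mean norm:", mean_group_norm(m, lam))
    print("Theoretical mean norm:", m / lam)
```

For m = 1 the mixing distribution reduces to an exponential with rate lam^2/2, which is the mixing distribution of the standard Bayesian Lasso. Conditioning on tau^2 makes the prior Gaussian, which is what makes Gibbs-style MCMC updates convenient in hierarchies of this kind.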
