Mixed and Covariate Dependent Graphical Models.

By Jie Cheng. Co-Chairs: Assoc. Prof. Elizaveta Levina and Prof. Ji Zhu.

Graphical models have proven to be a useful tool for understanding the conditional dependency structure of multivariate distributions. In Chapters II and III of the thesis, we consider two types of undirected graphical models, each motivated by a particular type of application.

The first model we consider is a mixed graphical model, linking both continuous and discrete variables. The proposed model is simple enough to be suitable for high-dimensional data, yet flexible enough to represent all possible graph structures for mixed types of data. We develop a computationally efficient regression-based algorithm for fitting the model by focusing on the conditional log-likelihood of each variable given the rest. The parameters have a natural group structure, and sparsity in the fitted graph is attained by incorporating a group lasso penalty, approximated by a weighted ℓ1 penalty for computational efficiency. We demonstrate the effectiveness of our method through an extensive simulation study and apply it to a music annotation data set (CAL500), obtaining a sparse and interpretable graphical model relating the continuous features of the audio signal to categorical variables such as genre, emotions, and usage associated with particular songs.
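As a rough illustration of the regression-based idea described above, the sketch below regresses each variable on all the others under a weighted ℓ1 penalty and joins two variables by an edge whenever either regression selects the other. It is not the thesis's algorithm: it handles continuous variables only, using least squares with coordinate descent, whereas a mixed model would also need a logistic-type conditional likelihood for the discrete variables and weights chosen to mimic the group penalty. The function names (`soft_threshold`, `standardize`, `weighted_lasso`, `neighborhood_graph`), the value of `lam`, and the toy data are all assumptions made for illustration.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator, the proximal map of t * |.|."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def standardize(col):
    """Center a column and scale it to unit variance (assumes it is not constant)."""
    n = len(col)
    mean = sum(col) / n
    sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5
    return [(v - mean) / sd for v in col]


def weighted_lasso(X, y, lam, weights, n_iter=200):
    """Coordinate descent for (1/2n)||y - X b||^2 + lam * sum_k weights[k] * |b_k|.

    X is a list of standardized columns; y is a standardized response.
    """
    n, p = len(y), len(X)
    beta = [0.0] * p
    resid = list(y)
    for _ in range(n_iter):
        for k in range(p):
            # Correlation of column k with the partial residual
            # (the residual with column k's current contribution added back).
            rho = sum(X[k][i] * resid[i] for i in range(n)) / n + beta[k]
            new = soft_threshold(rho, lam * weights[k])
            if new != beta[k]:
                delta = new - beta[k]
                for i in range(n):
                    resid[i] -= delta * X[k][i]
                beta[k] = new
    return beta


def neighborhood_graph(columns, lam, weights=None):
    """Regress each variable on the rest; keep edge (j, k) if either fit selects it.

    weights, if given, is a p-by-p table of per-coefficient penalty weights
    (hypothetical stand-in for weights that approximate a group penalty).
    """
    cols = [standardize(c) for c in columns]
    p = len(cols)
    edges = set()
    for j in range(p):
        others = [k for k in range(p) if k != j]
        w = [1.0 if weights is None else weights[j][k] for k in others]
        beta = weighted_lasso([cols[k] for k in others], cols[j], lam, w)
        for k, b in zip(others, beta):
            if b != 0.0:
                edges.add((min(j, k), max(j, k)))
    return edges


# Toy data: x1 depends on x0, while x2 is independent of both.
random.seed(0)
n = 300
x0 = [random.gauss(0, 1) for _ in range(n)]
x1 = [a + 0.5 * random.gauss(0, 1) for a in x0]
x2 = [random.gauss(0, 1) for _ in range(n)]

edges = neighborhood_graph([x0, x1, x2], lam=0.3)
print(sorted(edges))
```

Joining an edge when either regression selects it is one common symmetrization rule; requiring both regressions to agree is the stricter alternative. Raising `lam` makes the fitted graph sparser, and with standardized columns any `lam` above 1 returns an empty graph.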
