A Stepwise Approach of Finding Dependent Variables via Coefficient of Intrinsic Dependence

The coefficient of intrinsic dependence (CID) can detect associations among variables without making distributional or functional assumptions about the random variables. In this study, we developed the partial coefficient of intrinsic dependence (pCID) to facilitate the step-by-step selection of variables that are relevant to a target variable. Selecting relevant variables using the CID together with the pCID can eliminate interference from other relevant variables. In simulations, we observed that the proposed method is more sensitive to curvilinearity and more specific to linearity than the combination of Pearson's correlation coefficient and the partial correlation coefficient (PCC/pPCC). This property may make it possible to index different levels of curvilinearity according to CID/pCID outcomes. In practical trials using publicly available microarray data, the CID/pCID procedure successfully identified cold-responsive genes related to three C-repeat binding factors and was especially effective at identifying some sample-specific gene-gene interactions. The proposed strategy may therefore be beneficial in meta-analysis for distinguishing general forms of relationships from noise.
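
To make the stepwise idea concrete, the following is a minimal sketch of the PCC/pPCC baseline that the abstract compares against, not the CID/pCID procedure itself (whose definition is not given here). The function names `partial_corr` and `stepwise_select`, the threshold of 0.3, and the toy data are all assumptions made for illustration. The sketch shows why a correlation-based stepwise search can pick up a linear predictor yet miss a purely curvilinear one.

```python
import numpy as np


def partial_corr(x, y, controls):
    """Pearson correlation of x and y after regressing out `controls`.

    `controls` is a list of 1-D arrays; an empty list gives the plain PCC.
    """
    n = len(x)
    # Design matrix: intercept plus every control variable.
    design = np.column_stack([np.ones(n)] + list(controls))
    # Residuals of x and y after a least-squares fit on the design matrix.
    rx = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    ry = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return float(np.corrcoef(rx, ry)[0, 1])


def stepwise_select(target, candidates, threshold=0.3):
    """Greedy forward selection (hypothetical helper, illustrative threshold).

    At each step, pick the remaining candidate with the largest absolute
    partial correlation with `target`, given the variables already selected.
    Stop when no remaining candidate reaches `threshold`.
    """
    selected = []
    remaining = dict(candidates)
    while remaining:
        controls = [candidates[name] for name in selected]
        scores = {
            name: abs(partial_corr(target, values, controls))
            for name, values in remaining.items()
        }
        best = max(scores, key=scores.get)
        if scores[best] < threshold:
            break
        selected.append(best)
        del remaining[best]
    return selected


# Toy data on a full grid, so x1 and x2 are exactly uncorrelated
# and x2 is symmetric about zero.
grid = np.linspace(-1.0, 1.0, 21)
g1, g2 = np.meshgrid(grid, grid)
x1, x2 = g1.ravel(), g2.ravel()

# The target depends linearly on x1 and quadratically on x2.
y = 2.0 * x1 + x2 ** 2

selected = stepwise_select(y, {"x1": x1, "x2": x2})
print(selected)
```

On this toy grid the search should keep `x1` and stop before `x2`, because the partial correlation between `y` and `x2` given `x1` is essentially zero even though `y` depends on `x2` through its square. A dependence measure that makes no functional assumptions, as the abstract describes for the CID/pCID, is aimed at exactly this kind of relationship.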
