Supervised Cluster Analysis of miRNA Expression Data Using Rough Hypercuboid Partition Matrix

MicroRNAs (miRNAs) are small, endogenous non-coding RNAs found in plants and animals that suppress the expression of genes post-transcriptionally. Various genome-wide studies suggest that a substantial fraction of miRNA genes is likely to form clusters. The coherent expression of these miRNA clusters can then be used to classify samples according to clinical outcome. Against this background, a new rough hypercuboid based supervised similarity measure is proposed and integrated with supervised attribute clustering to find groups of miRNAs whose coherent expression can classify samples. The proposed method directly incorporates the information of sample categories into the miRNA clustering process, yielding a supervised clustering algorithm for miRNAs. The effectiveness of the rough hypercuboid based algorithm, along with a comparison with other related algorithms, is demonstrated on three miRNA microarray expression data sets using the \(B.632+\) bootstrap error rate of the support vector machine. The association of the miRNA clusters with various biological pathways is also shown through pathway enrichment analysis.
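As an illustration of the evaluation criterion named above, the following is a minimal sketch of how a .632+ bootstrap error estimate can be combined from its three ingredients, following the standard Efron and Tibshirani formulation. The function names `no_information_rate` and `bootstrap_632_plus` are hypothetical, and the sketch assumes the training (resubstitution) error and the out-of-bag bootstrap error of the classifier have already been computed elsewhere; it is not the authors' implementation.

```python
from collections import Counter


def no_information_rate(y_true, y_pred):
    """Estimate gamma = sum_k p_k * (1 - q_k), the error rate expected
    if class labels and predictions were statistically independent.
    p_k is the observed fraction of class k, q_k the predicted fraction."""
    n = len(y_true)
    p = Counter(y_true)
    q = Counter(y_pred)
    return sum((p[k] / n) * (1.0 - q[k] / n) for k in p)


def bootstrap_632_plus(err_train, err_oob, gamma):
    """Combine the training error and the out-of-bag bootstrap error
    into a .632+ estimate (hypothetical helper, assumed inputs in [0, 1])."""
    # The out-of-bag error is capped at the no-information rate.
    err_oob = min(err_oob, gamma)
    # Relative overfitting rate R, set to 0 when it is not well defined.
    if err_oob > err_train and gamma > err_train:
        r = (err_oob - err_train) / (gamma - err_train)
    else:
        r = 0.0
    # The weight moves from 0.632 (no overfitting) to 1 (severe overfitting).
    w = 0.632 / (1.0 - 0.368 * r)
    return (1.0 - w) * err_train + w * err_oob


if __name__ == "__main__":
    gamma = no_information_rate([0, 0, 1, 1], [0, 1, 0, 1])
    print(bootstrap_632_plus(err_train=0.05, err_oob=0.20, gamma=gamma))
```

In practice the out-of-bag error would be averaged over many bootstrap resamples of the samples, with the classifier (here a support vector machine) retrained on each resample; that loop is omitted to keep the sketch self-contained.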
