MSL: A Measure to Evaluate Three-dimensional Patterns in Gene Expression Data

Microarray technology is widely used in biological research because it can monitor RNA concentration levels. Analyzing the data it generates is a computational challenge because of the characteristics of these data. Clustering techniques are widely applied to group genes that exhibit similar behavior. Biclustering relaxes the grouping constraints, allowing genes to be evaluated under only a subset of the conditions. Triclustering extends this idea to longitudinal experiments, in which genes are evaluated under certain conditions at several time points. Triclusters reveal hidden information in the form of behavior patterns from temporal microarray experiments, relating subsets of genes, experimental conditions, and time points. We present an evaluation measure for triclusters called Multi Slope Measure (MSL), based on the similarity among the angles of the slopes of the profiles formed by the genes, conditions, and times of the tricluster.
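The idea of comparing slope angles can be illustrated with a minimal sketch. It is not the MSL definition, which the abstract does not give. It assumes unit spacing between time points, looks only at gene profiles over time within each condition, and scores a tricluster by the mean absolute difference between the angles of matching segments. The function names and the `tri[g][c][t]` layout are hypothetical choices made for this illustration.

```python
import math
from itertools import combinations


def slope_angles(profile):
    """Angle in radians of each segment joining consecutive points.

    Assumes unit spacing on the x-axis (an assumption of this sketch).
    """
    return [math.atan2(b - a, 1.0) for a, b in zip(profile, profile[1:])]


def angle_dissimilarity(profiles):
    """Mean absolute angle difference over all pairs of profiles.

    Returns 0.0 when every profile has the same shape, even if shifted.
    """
    angles = [slope_angles(p) for p in profiles]
    total, count = 0.0, 0
    for x, y in combinations(angles, 2):
        for ax, ay in zip(x, y):
            total += abs(ax - ay)
            count += 1
    return total / count if count else 0.0


def tricluster_score(tri):
    """Average angle dissimilarity of gene profiles across conditions.

    tri[g][c][t] is the expression of gene g under condition c at time t
    (a hypothetical layout chosen for this sketch).
    """
    n_conditions = len(tri[0])
    per_condition = [
        angle_dissimilarity([gene[c] for gene in tri])
        for c in range(n_conditions)
    ]
    return sum(per_condition) / len(per_condition)


# Two genes, one condition, three time points: same shape, shifted level.
shifted = [[[1.0, 2.0, 3.0]], [[5.0, 6.0, 7.0]]]
# Two genes moving in opposite directions.
opposite = [[[0.0, 1.0]], [[0.0, -1.0]]]
```

Under these assumptions a lower score means more similar profile shapes: `tricluster_score(shifted)` is zero because both genes rise at the same angle, while `tricluster_score(opposite)` is large because the slopes point in opposite directions.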
