Simultaneous Clustering and Feature Weighting Using Multiobjective Optimization for Identifying Functionally Similar miRNAs

MicroRNAs (miRNAs) are a class of RNAs that regulate gene expression. Recent research indicates that miRNAs form clusters on chromosomes. The miRNAs belonging to a particular cluster are highly similar in their activity and are termed “coregulated” miRNAs. This paper presents an approach that simultaneously performs two tasks: i) clustering miRNAs into different categories based on a similarity measure, and ii) identifying proper weight values for the different time points at which expression values are available. In general, a large number of expression values are available for a given miRNA data set, and not all of them are equally suitable for measuring the similarity between two miRNAs. In the current study, selecting proper weight values for the time points and then determining a proper partitioning of the miRNA data set, using the similarity computed with those weights, is formulated as an optimization problem in which several cluster validity indices are optimized as goodness measures. To that end, a multiobjective differential evolution based optimization technique is used. The superiority of the proposed technique over some recent approaches is tested on three miRNA data sets in terms of popular performance measures such as the Silhouette index and the DB-index. The observations are further supported by statistical and biological significance tests. Supplementary information is available at https://www.iitp.ac.in/~sriparna/journals.html.
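
The formulation above can be illustrated with a minimal sketch of how one candidate solution might be scored. Everything here is an assumption made for illustration, not the paper's implementation: a candidate is taken to be a weight vector over time points plus a set of cluster centers, similarity is a weighted Euclidean distance, and the Silhouette index and DB-index stand in as the two objectives (the abstract does not name the validity indices actually optimized). The function names and the toy data are hypothetical, and the differential evolution search loop itself is omitted.

```python
import math

def weighted_distance(x, y, w):
    """Weighted Euclidean distance; w[t] is the (assumed) weight of time point t."""
    return math.sqrt(sum(wt * (a - b) ** 2 for wt, a, b in zip(w, x, y)))

def assign(data, centers, w):
    """Label each expression profile with the index of its nearest center."""
    return [min(range(len(centers)),
                key=lambda k: weighted_distance(x, centers[k], w))
            for x in data]

def silhouette(data, labels, w):
    """Mean silhouette width under the weighted distance (higher is better)."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for i, x in enumerate(data):
        mean_dist = {}
        for k in clusters:
            members = [j for j, l in enumerate(labels) if l == k and j != i]
            if members:
                mean_dist[k] = sum(weighted_distance(x, data[j], w)
                                   for j in members) / len(members)
        own = labels[i]
        if own not in mean_dist:  # singleton cluster contributes 0
            continue
        a = mean_dist[own]
        b = min(d for k, d in mean_dist.items() if k != own)
        total += (b - a) / max(a, b) if max(a, b) > 0 else 0.0
    return total / len(data)

def davies_bouldin(data, labels, w):
    """DB-index under the weighted distance (lower is better)."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        return float("inf")
    centroid, scatter = {}, {}
    for k in clusters:
        members = [x for x, l in zip(data, labels) if l == k]
        centroid[k] = [sum(col) / len(members) for col in zip(*members)]
        scatter[k] = sum(weighted_distance(x, centroid[k], w)
                         for x in members) / len(members)
    worst = []
    for k in clusters:
        ratios = []
        for m in clusters:
            if m == k:
                continue
            sep = weighted_distance(centroid[k], centroid[m], w)
            ratios.append((scatter[k] + scatter[m]) / sep if sep > 0 else float("inf"))
        worst.append(max(ratios))
    return sum(worst) / len(clusters)

def evaluate(data, centers, w):
    """Objective vector with both entries to be minimized: (-Silhouette, DB)."""
    labels = assign(data, centers, w)
    return (-silhouette(data, labels, w), davies_bouldin(data, labels, w))

def dominates(f, g):
    """Pareto dominance for minimization: f is no worse everywhere, better somewhere."""
    return all(a <= b for a, b in zip(f, g)) and any(a < b for a, b in zip(f, g))

# Hypothetical toy data: time points 0-1 separate the two groups, 2-3 are noise.
data = [[0, 0, 5, 0], [0, 1, 0, 5], [1, 0, 5, 5],
        [10, 10, 0, 5], [10, 9, 5, 0], [9, 10, 0, 0]]
centers = [[0, 0, 2.5, 2.5], [10, 10, 2.5, 2.5]]
w_informative = [0.5, 0.5, 0.0, 0.0]   # weight only the informative time points
w_uniform = [0.25, 0.25, 0.25, 0.25]   # treat all time points equally

f_informative = evaluate(data, centers, w_informative)
f_uniform = evaluate(data, centers, w_uniform)
print(dominates(f_informative, f_uniform))
```

In a multiobjective search of this kind, an evolutionary optimizer would generate many such candidates and keep those that no other candidate dominates. On the toy data, down-weighting the noisy time points tightens the clusters relative to their separation, which is the effect the feature-weighting step is meant to capture.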
