A multi-objective biclustering algorithm based on fuzzy mathematics

Biclustering clusters the rows and columns of a matrix simultaneously. The algorithm identifies a set of sub-matrices with a greedy iterative strategy, using the mean squared residue to measure the element consistency of each sub-matrix. Biclustering is widely applied to large and complex data. However, existing biclustering algorithms share a common problem: as the data size increases, more irrelevant rows or columns are drawn into the clusters, which degrades clustering performance. This paper therefore proposes a new algorithm that combines the fuzzy membership matrix and comprehensive evaluation from fuzzy mathematics with a multi-objective optimization algorithm to improve biclustering performance. To validate its effectiveness, the new algorithm is compared with three mainstream algorithms on three gene/protein expression datasets. The results show that the new algorithm achieves better element consistency and sub-matrix capacity than the other algorithms.
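
The abstract names the mean squared residue as the measure of element consistency but does not define it. The sketch below assumes the standard Cheng and Church definition, in which each entry of the sub-matrix is compared with its row mean, its column mean, and the overall mean; a lower score means a more coherent sub-matrix. The function name `mean_squared_residue` and the toy matrix are illustrative only and are not taken from the paper.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the sub-matrix selected by `rows` and `cols`.

    Assumes the Cheng and Church definition:
    residue(i, j) = a_ij - row_mean_i - col_mean_j + overall_mean
    """
    n_rows, n_cols = len(rows), len(cols)
    if n_rows == 0 or n_cols == 0:
        raise ValueError("rows and cols must be non-empty")

    # Means are taken over the selected sub-matrix only.
    row_means = {i: sum(matrix[i][j] for j in cols) / n_cols for i in rows}
    col_means = {j: sum(matrix[i][j] for i in rows) / n_rows for j in cols}
    overall_mean = sum(row_means.values()) / n_rows

    total = 0.0
    for i in rows:
        for j in cols:
            residue = matrix[i][j] - row_means[i] - col_means[j] + overall_mean
            total += residue * residue
    return total / (n_rows * n_cols)


# Toy data (hypothetical): the first three columns follow an additive
# pattern across rows, while the fourth column does not.
data = [
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 3.0, 4.0, 1.0],
    [4.0, 5.0, 6.0, 7.0],
]

coherent = mean_squared_residue(data, rows=[0, 1, 2], cols=[0, 1, 2])
noisy = mean_squared_residue(data, rows=[0, 1, 2], cols=[0, 1, 2, 3])
print(coherent, noisy)
```

On this toy input the purely additive sub-matrix scores essentially zero, while including the unrelated fourth column raises the score. This is the effect the abstract describes when irrelevant rows or columns are drawn into a cluster.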
