Bi-MOCK: A Multi-objective Evolutionary Algorithm for Bi-clustering with Automatic Determination of the Number of Bi-clusters

Bi-clustering is one of the main tasks in data mining, with applications in bioinformatics, pattern recognition, and text mining, to name a few. It refers to simultaneously partitioning a data matrix along both its rows and its columns. One of the main issues in bi-clustering is the difficulty of determining the number of bi-clusters, which usually has to be specified in advance by the user. Over the last decade, the MOCK algorithm has shown good performance in data clustering while determining the number of clusters automatically. Motivated by these results, we propose in this paper a new algorithm, called Bi-MOCK, which can be seen as an extension of MOCK to bi-clustering. Like MOCK, Bi-MOCK relies on multi-objective optimization, and it finds the number of bi-clusters automatically thanks to a newly proposed variable-length string encoding scheme. The performance of the proposed algorithm is assessed on a set of real gene expression datasets. The comparative experiments show that Bi-MOCK outperforms several recent existing methods.