Precision-recall space to correct external indices for biclustering

Biclustering is a major data mining tool in many domains, and many algorithms have emerged in recent years. All of these algorithms aim to obtain coherent biclusters, so a reliable procedure for validating them is crucial. We point out the problem of size bias in biclustering evaluation and show how it can lead to wrong conclusions in a comparative study. We present theoretical corrections for all of the most popular measures in order to remove this bias. We introduce the corrected precision-recall space, which combines the advantages of corrected measures with the ease of interpretation and visualization of uncorrected measures. Numerical experiments demonstrate the value of our approach.