Identification of promoter regions in genomic sequences by 1-dimensional constraint clustering

Size-constrained clustering has recently been proposed as a way to embed "a priori" knowledge in clustering methods. By exploiting the "string property", we propose an exact and efficient algorithm, based on dynamic programming, for solving size-constrained one-dimensional clustering problems. We show the applicability of the proposed method to a difficult computational biology problem: the prediction of the transcription start sites of genes. The experimental results show the potential of the proposed approach when compared with previously published methods.
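
To make the idea concrete, the following is a minimal sketch of how a dynamic program can solve a size-constrained one-dimensional clustering problem exactly. It is not the authors' algorithm. It assumes that the "string property" means an optimal solution can be found among clusters that are contiguous runs of the sorted values, that the objective is the within-cluster sum of squared deviations, and that the size constraint is a lower and an upper bound on each cluster's cardinality. The function name and the parameters k, min_size and max_size are hypothetical.

```python
from typing import List, Sequence, Tuple


def size_constrained_1d_clustering(
    values: Sequence[float], k: int, min_size: int, max_size: int
) -> Tuple[float, List[List[float]]]:
    """Split values into k contiguous (in sorted order) clusters whose sizes
    lie in [min_size, max_size], minimizing the within-cluster sum of squares.

    Assumed objective and constraint form; illustrative only.
    """
    xs = sorted(values)
    n = len(xs)
    if k < 1 or min_size < 1 or k * min_size > n or k * max_size < n:
        raise ValueError("no feasible clustering for these size bounds")

    # Prefix sums let us compute the cost of any run xs[i:j] in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for idx, x in enumerate(xs):
        s1[idx + 1] = s1[idx] + x
        s2[idx + 1] = s2[idx] + x * x

    def sse(i: int, j: int) -> float:
        """Sum of squared deviations from the mean for xs[i:j], j > i."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[c][j]: minimal cost of covering xs[:j] with exactly c clusters.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    # back[c][j]: start index of the last cluster in that optimal cover.
    back = [[-1] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0

    for c in range(1, k + 1):
        for j in range(1, n + 1):
            # Only cluster sizes allowed by the constraint are tried.
            for size in range(min_size, min(max_size, j) + 1):
                i = j - size
                if best[c - 1][i] == inf:
                    continue
                cand = best[c - 1][i] + sse(i, j)
                if cand < best[c][j]:
                    best[c][j] = cand
                    back[c][j] = i

    # Walk the back-pointers from the full prefix to recover the clusters.
    clusters: List[List[float]] = []
    j = n
    for c in range(k, 0, -1):
        i = back[c][j]
        clusters.append(xs[i:j])
        j = i
    clusters.reverse()
    return best[k][n], clusters
```

Under these assumptions the table has (k + 1)(n + 1) entries and each entry examines at most max_size - min_size + 1 candidate cluster sizes, so the running time is polynomial in the input size. The call size_constrained_1d_clustering(values, k, min_size, max_size) returns the optimal cost together with the clusters in increasing order.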