Parallel algorithms for large-scale EST clustering: current progress

An expressed sequence tag (EST) is a segment of cDNA that carries part of the genetic information of an expressed gene. ESTs originating from the same gene can be grouped into one cluster according to their overlapping regions. Clustering is a necessary step in the analysis of expressed genes. Because of its high computational complexity and large memory requirement, traditional serial methods cannot handle large-scale EST clustering on common computing systems. This review introduces parallel processing methods and the software and hardware environments that support them, and analyzes the parallel algorithms and software suited to large-scale clustering. Finally, the computing speed and memory requirements of these algorithms are compared, and the advantages and disadvantages of each algorithm are discussed.
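
To make the idea of grouping ESTs by their overlaps concrete, the following is a minimal serial sketch, not any of the reviewed algorithms. It assumes exact suffix-prefix matches of at least `min_len` bases and merges overlapping ESTs with a union-find structure; the names `has_overlap`, `cluster_ests`, and `min_len` are hypothetical. Real tools would use approximate alignment and would also handle reverse complements and containment, which this sketch ignores.

```python
from itertools import combinations


def has_overlap(a, b, min_len):
    """True if a suffix of one sequence exactly equals a prefix of the other."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-n:] == b[:n] or b[-n:] == a[:n]:
            return True
    return False


def cluster_ests(seqs, min_len=5):
    """Group sequence indices into clusters of transitively overlapping ESTs."""
    parent = list(range(len(seqs)))

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All-pairs comparison: quadratic in the number of ESTs.
    for i, j in combinations(range(len(seqs)), 2):
        if has_overlap(seqs[i], seqs[j], min_len):
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(seqs)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


if __name__ == "__main__":
    ests = ["ACGTACGGTT", "CGGTTAGCAA", "AGCAATTTGC", "GGGGGCCCCC", "CCCCCAAAAA"]
    print(cluster_ests(ests, min_len=5))
```

The all-pairs loop compares every pair of sequences, so the work grows quadratically with the number of ESTs. This illustrates why a serial approach becomes impractical at large scale.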