Assessment of the parallelization approach of d2_cluster for high‐performance sequence clustering

The exponential growth of expressed sequence tag (EST) data increases the computational cost of sequence clustering to the point that new algorithms are required to analyze the data at a greater rate. We have parallelized d2_cluster on an SGI Origin 2000 multiprocessor and observed a speedup of approximately 100× on 126 processors when processing a dataset of 15,876 ESTs. The parallelized d2_cluster code is available from the SANBI website (http://www.sanbi.ac.za/CODES). © 2002 Wiley Periodicals, Inc. J Comput Chem 23: 755–757, 2002
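
To put the reported speedup in context, the short sketch below derives two standard figures from the numbers stated above (a speedup of about 100 on 126 processors): the parallel efficiency, and the serial fraction implied by Amdahl's law. The function names are hypothetical and are not part of d2_cluster, and applying Amdahl's law here is an assumption made purely for illustration; it ignores communication and load-balancing effects and is not a measurement taken from the code.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Fraction of ideal linear speedup achieved: E = S / p."""
    return speedup / processors


def amdahl_serial_fraction(speedup: float, processors: int) -> float:
    """Serial fraction f implied by Amdahl's law, S = 1 / (f + (1 - f) / p).

    Solving for f gives f = (p / S - 1) / (p - 1). This is an illustrative
    model only; it assumes the speedup is limited solely by a fixed serial part.
    """
    if processors < 2:
        raise ValueError("at least two processors are needed to estimate f")
    return (processors / speedup - 1.0) / (processors - 1.0)


if __name__ == "__main__":
    # Figures taken from the abstract: ~100x speedup on 126 processors.
    reported_speedup = 100.0
    processors = 126

    efficiency = parallel_efficiency(reported_speedup, processors)
    serial = amdahl_serial_fraction(reported_speedup, processors)

    print(f"parallel efficiency: {efficiency:.1%}")
    print(f"implied serial fraction (Amdahl model): {serial:.4%}")
```

Under these assumptions the reported figures correspond to an efficiency of roughly 79% and an implied serial fraction of about 0.2%; because the speedup is itself quoted as approximate, both values should be read as rough estimates.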