* Corresponding author. E-mail address: sko@nsc.liu.se
‡ Corresponding author. E-mail address: pborovska@tu-sofia.bg
† Corresponding author. E-mail address: vgan@tu-sofia.bg

Abstract

This activity within the PRACE-2IP project aims to investigate and improve the performance of the multiple sequence alignment software ClustalW on the BlueGene/Q supercomputer JUQUEEN, using influenza virus sequences as a case study. To this end, the code has been ported, tuned, profiled, and scaled. A parallel I/O interface has been designed for efficient input of the sequence dataset: the processes are divided into sub-groups, and the local master of each sub-group reads the dataset and broadcasts it to its slaves. The optimal group size has been investigated, and the effect of the read buffer size on read performance has been examined experimentally. Applied to ClustalW, the implementation with parallel I/O performs considerably better than the original code in the I/O segment, reaching up to a 6.8-fold speed-up for dataset input when using 8192 JUQUEEN cores.
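The grouped input scheme described in the abstract can be illustrated with a small, MPI-free simulation. This is only a sketch under stated assumptions, not the authors' implementation: the function names `plan_groups` and `simulate_input`, the contiguous rank-to-group mapping, and the group size used in the example are all hypothetical. In an actual MPI program the group id would plausibly serve as the `color` argument of `MPI_Comm_split`, and the distribution step would be an `MPI_Bcast` inside each sub-communicator.

```python
def plan_groups(world_size, group_size):
    """Map each global rank to (group id, rank within its group).

    Assumption: ranks are assigned to groups in contiguous blocks.
    """
    if world_size <= 0 or group_size <= 0:
        raise ValueError("world_size and group_size must be positive")
    return [(rank // group_size, rank % group_size) for rank in range(world_size)]


def simulate_input(world_size, group_size, dataset):
    """Simulate one read per local master followed by an in-group broadcast."""
    plan = plan_groups(world_size, group_size)
    reads = 0
    group_copy = {}  # group id -> data held by that group's local master
    for group, local_rank in plan:
        if local_rank == 0:  # local master: the only rank that reads the file
            group_copy[group] = list(dataset)
            reads += 1
    # Broadcast step: every rank receives the copy held by its own group.
    received = [group_copy[group] for group, _ in plan]
    return reads, received


# Hypothetical example: 16 ranks in groups of 4 give 4 readers instead of 16.
reads, received = simulate_input(world_size=16, group_size=4, dataset=["seqA", "seqB"])
print(reads, len(received))
```

The group size sets the trade-off: larger groups mean fewer simultaneous readers of the file but a wider broadcast within each group, which is presumably why the optimal group size had to be determined experimentally.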