Improving the performance of the Needleman-Wunsch algorithm using parallelization and vectorization techniques