HMMer is a widely used bioinformatics software package that uses profile hidden Markov models (HMMs) to model the primary structure consensus of a family of protein or nucleic acid sequences. However, with the rapid growth of both sequence and model databases, running HMMer on traditional computer architectures is becoming increasingly time-consuming. With modern field-programmable gate array (FPGA) technology, applications can be accelerated on a CPU-FPGA cooperative system by mapping computation-intensive work onto the FPGA. In this paper, the computation kernel of HMMer, P7Viterbi, is selected for FPGA acceleration. After careful data dependency analysis, we propose a systolic-array-based reconfigurable architecture that exploits both inter-module and intra-module parallelism. P7Viterbi contains an infrequently taken feedback loop that updates the value of the beginning state (B state), which limits further parallelization. Previous work either ignored the feedback loop or serialized the process, losing either precision or efficiency. Our proposed architecture exploits maximum parallelism without loss of precision: it runs speculatively with full parallelism, assuming that the feedback loop is not taken, and if the rare feedback case does occur, a rollback mechanism ensures correctness. Results show that on a Xilinx Virtex-5 110T FPGA, the proposed architecture achieves a speedup of about 56.8 times over an Intel Core2 Duo 2.33GHz CPU.
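The speculate-then-rollback idea can be illustrated with a small software sketch. This is not the proposed hardware design and not HMMer's P7Viterbi code: it is a toy, match-state-only Viterbi recurrence in which the transition scores, the function names, and the one-row speculation depth are all assumptions made for illustration. Each row is first computed with a B score that ignores the J-to-B feedback path; if the exact B score turns out to differ, the row is recomputed, so the result always matches the serial reference.

```python
NEG = -10**9  # stands in for minus infinity in integer log-odds scores

# Hypothetical transition scores for the toy model (not HMMer's values).
T_MM, T_BM = 0, -1            # M(k-1) -> M(k), B -> M(k)
T_NN, T_NB = 0, -2            # N -> N, N -> B
T_EJ, T_JJ, T_JB = -3, 0, -2  # E -> J, J -> J, J -> B (the feedback path)
T_EC, T_CC = -1, 0            # E -> C, C -> C


def match_row(m_prev, b, emit):
    """Compute one row of match scores from the previous row and a B score."""
    row = []
    for k, score in enumerate(emit):
        from_m = m_prev[k - 1] + T_MM if k > 0 else NEG
        row.append(max(from_m, b + T_BM) + score)
    return row


def special_states(m, n, j, c):
    """Update the N, J and C scores once a full match row is available."""
    e = max(m)  # E collects the best match score of the row
    return n + T_NN, max(j + T_JJ, e + T_EJ), max(c + T_CC, e + T_EC)


def viterbi_serial(emits):
    """Reference: every row waits for the exact B score of the previous row."""
    m, n, j, c = [NEG] * len(emits[0]), 0, NEG, NEG
    b = n + T_NB
    for emit in emits:
        m = match_row(m, b, emit)
        n, j, c = special_states(m, n, j, c)
        b = max(n + T_NB, j + T_JB)
    return c


def viterbi_speculative(emits):
    """Speculate that J -> B never wins; recompute a row when it does."""
    m, n, j, c = [NEG] * len(emits[0]), 0, NEG, NEG
    b_spec = b_true = n + T_NB
    rollbacks = 0
    for emit in emits:
        # Speculative row: uses the B score that ignores the feedback path.
        row = match_row(m, b_spec, emit)
        # The exact B score is modeled as arriving late; on a mismatch the
        # speculative row is discarded and recomputed (the rollback).
        if b_true != b_spec:
            row = match_row(m, b_true, emit)
            rollbacks += 1
        m = row
        n, j, c = special_states(m, n, j, c)
        b_spec = n + T_NB
        b_true = max(b_spec, j + T_JB)
    return c, rollbacks
```

For example, `viterbi_speculative([[-1, -1, -1]] * 4)` never takes the feedback path and so performs no rollback, whereas an input with strongly positive emission scores such as `[[5, 5, 5]] * 3` triggers one. In both cases the score equals that of `viterbi_serial`, which is the property the rollback is meant to guarantee.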