DPOK: Reinforcement Learning for Fine-tuning Text-to-Image Diffusion Models