Delay Minimization in Multi-UAV Assisted Wireless Networks: A Reinforcement Learning Approach