A Reinforcement Learning Based Online Coverage Path Planning Algorithm