Multi-agent Q-Learning control of spacecraft formation flying reconfiguration trajectories