Bandit approach to conflict-free multi-agent Q-learning in view of photonic implementation