Distributed value function approximation for multi-agent reinforcement learning