A multi-agent reinforcement learning approach for investigating and optimising peer-to-peer prosumer energy markets