Investigating the exploration-exploitation trade-off in dynamic environments with multiple agents