Multi-Agent Deep Reinforcement Learning Based Resource Management in SWIPT Enabled Cellular Networks with H2H/M2M Co-Existence