Cooperation-eliciting prisoner's dilemma payoffs for reinforcement learning agents

This work considers stateless Q-learning agents in the iterated Prisoner's Dilemma (PD). In earlier work, we gave a condition on the PD payoffs and Q-learning parameters that helps stateless Q-learning agents cooperate with each other. That condition, however, rests on a restrictive premise. This work relaxes the premise and presents a new payoff condition for mutual cooperation. From the new condition, we then derive the payoff relations that elicit mutual cooperation.
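
For concreteness, the setting can be sketched as two stateless Q-learning agents playing the iterated PD against each other. This is a minimal illustration, not the conditions derived in this work. The payoff values (T, R, P, S), the learning rate `alpha`, the discount factor `gamma`, the epsilon-greedy exploration, and the particular form of the stateless update are all assumptions chosen for the example.

```python
import random

# Illustrative PD payoffs (assumed, not taken from this work): T > R > P > S.
T, R, P, S = 5.0, 3.0, 1.0, 0.0
C, D = 0, 1  # action indices: cooperate, defect


def payoff(my_action, other_action):
    """Payoff to the player choosing my_action in a one-shot PD."""
    table = {(C, C): R, (C, D): S, (D, C): T, (D, D): P}
    return table[(my_action, other_action)]


class StatelessQAgent:
    """Q-learner with one Q-value per action and no state (assumed form)."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, rng=None):
        self.q = [0.0, 0.0]  # Q-values for C and D
        self.alpha = alpha  # learning rate (assumed value)
        self.gamma = gamma  # discount factor (assumed value)
        self.epsilon = epsilon  # exploration rate (assumed value)
        self.rng = rng or random.Random()

    def act(self):
        # Epsilon-greedy action selection; ties are broken in favour of C.
        if self.rng.random() < self.epsilon:
            return self.rng.choice([C, D])
        return C if self.q[C] >= self.q[D] else D

    def update(self, action, reward):
        # Stateless update: the bootstrap term is the max over the same Q-table.
        target = reward + self.gamma * max(self.q)
        self.q[action] += self.alpha * (target - self.q[action])


def play(rounds=1000, seed=0):
    """Run the iterated PD and return both Q-tables and the mutual-cooperation rate."""
    rng = random.Random(seed)
    a, b = StatelessQAgent(rng=rng), StatelessQAgent(rng=rng)
    mutual_cooperation = 0
    for _ in range(rounds):
        x, y = a.act(), b.act()
        a.update(x, payoff(x, y))
        b.update(y, payoff(y, x))
        mutual_cooperation += int(x == C and y == C)
    return a.q, b.q, mutual_cooperation / rounds
```

Calling `play()` returns the learned Q-values of both agents and the fraction of rounds with mutual cooperation. Whether cooperation emerges depends on the payoffs and learning parameters, which is the relationship this work characterizes.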