Reinforcement Learning Policy Recommendation for Interbank Network Stability