Discovery of Optimal Quantum Error Correcting Codes via Reinforcement Learning