RLRBM: A Reinforcement Learning-based RAN Buffer Management Scheme