Application of deep reinforcement learning for spike sorting under multi-class imbalance