Tuning of reinforcement learning parameters applied to SOP using the Scott–Knott method