Gain/variability tradeoffs in undiscounted Markov decision processes