Augmenting max-weight with explicit learning for wireless scheduling with switching costs