Learning With Guarantee Via Constrained Multi-Armed Bandit: Theory and Network Applications