Multi-armed bandit channel selection for power line communication

We consider a multi-channel power line communication (PLC) system in which the channel coefficients follow non-identical log-normal distributions. We assume that the statistical characteristics of each channel vary over time; in other words, the channels are non-stationary. In this scenario, we formulate a channel selection problem in which a transmitter, given no prior information, aims to select the best of the available PLC channels so as to maximize the average utility, expressed in terms of data rate. We cast this channel selection problem as a piecewise stationary multi-armed bandit game and solve it using bandit algorithms. Numerical analysis demonstrates the applicability and effectiveness of the proposed model and solution.
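
To make the formulation concrete, the following is a minimal sketch of channel selection as a piecewise stationary bandit. It is an illustration under stated assumptions, not necessarily the method evaluated in this work: the abstract does not name a specific algorithm, so a sliding-window UCB rule is used here as one common choice for piecewise stationary settings. The function name, the utility log2(1 + snr * gain), the equal-length segments, and all parameter values are hypothetical.

```python
import math
import random
from collections import deque


def simulate_sw_ucb(mu_segments, sigma, horizon, window=200, snr=10.0, c=2.0, seed=0):
    """Sliding-window UCB on channels with log-normal gains (illustrative sketch).

    mu_segments: one list per stationary segment, giving the log-mean of each
                 channel gain in that segment (segments assumed equal length).
    sigma:       log-standard deviation of the gains (assumed common to all channels).
    Assumes window >= number of channels.
    Returns (choices, average_reward).
    """
    rng = random.Random(seed)
    n_arms = len(mu_segments[0])
    seg_len = horizon // len(mu_segments)
    recent = deque(maxlen=window)  # (arm, reward) pairs from the last `window` rounds
    choices = []
    total = 0.0

    for t in range(horizon):
        seg = min(t // seg_len, len(mu_segments) - 1)

        # Statistics are computed from the sliding window only,
        # so observations made before a change point are eventually forgotten.
        counts = [0] * n_arms
        sums = [0.0] * n_arms
        for a, r in recent:
            counts[a] += 1
            sums[a] += r

        untried = [a for a in range(n_arms) if counts[a] == 0]
        if untried:
            # A channel with no sample in the window is explored first.
            arm = untried[0]
        else:
            n_eff = min(t, window)
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(c * math.log(n_eff) / counts[a]),
            )

        # Assumed utility: data rate log2(1 + snr * gain) with a log-normal gain.
        gain = rng.lognormvariate(mu_segments[seg][arm], sigma)
        reward = math.log2(1.0 + snr * gain)

        recent.append((arm, reward))
        choices.append(arm)
        total += reward

    return choices, total / horizon


if __name__ == "__main__":
    # Two channels whose quality swaps halfway through the horizon.
    picks, avg_rate = simulate_sw_ucb([[0.0, -3.0], [-3.0, 0.0]], sigma=0.3, horizon=2000)
    print("average rate:", avg_rate)
    print("share of channel 1 in the last 500 rounds:", picks[-500:].count(1) / 500)
```

The window length trades off estimation accuracy within a stationary segment against the speed of adaptation after a change; discounting past rewards or adding explicit change detection are alternative ways to handle the same non-stationarity.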