Graph Convolutional Network Upper Confidence Bound

We formulate a new problem at the intersection of semi-supervised learning and contextual bandits, motivated by applications such as clinical trials and ad recommendation. We show how a Graph Convolutional Network (GCN), a semi-supervised learning approach, can be adapted to this problem formulation. We also propose a variant of the linear contextual bandit that imputes missing rewards in a semi-supervised manner. We then combine the strengths of both approaches to develop a multi-GCN embedded contextual bandit. We evaluate our algorithms on several real-world datasets.