Multi-energy computed tomography (CT) is an emerging medical imaging modality with a number of potential applications in diagnosis and therapy. However, high system cost and technical barriers have hindered its adoption in routine clinical practice. In this study, we propose a framework to realize multi-energy cone beam CT (ME-CBCT) on CBCT systems, which are widely available and routinely used for image guidance in radiotherapy. Our method implements a kVp-switching technique that acquires x-ray projections with the kVp level cycling through a number of values. In this kVp-switching-based ME-CBCT acquisition, the x-ray projections of each energy channel form only a subset of all acquired projections. This leads to an undersampling issue that makes the reconstruction problem challenging. We propose a spatial spectral non-local means (ssNLM) method to reconstruct ME-CBCT images, which exploits image correlations along both the spatial and spectral directions to suppress noise and streak artifacts. To address the difference in intensity scale between energy channels, a histogram matching method is incorporated. Our method differs from conventional NLM methods in that the spectral dimension is included, which helps to effectively remove streak artifacts that appear along different directions in images of different energy channels. A convergence analysis of our algorithm is provided. A comprehensive set of simulation and real experimental studies demonstrates the feasibility of our ME-CBCT scheme and its capability to achieve superior image quality compared to conventional filtered backprojection-type (FBP) and NLM reconstruction methods.
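To illustrate why kVp switching leads to undersampling, the following minimal Python sketch groups projection indices by energy channel. It assumes a simple round-robin cycle over distinct kVp levels and equally spaced gantry angles. The abstract does not specify the switching pattern, the number of projections, or the kVp values, so the function names (`assign_projections`, `projection_angles`) and the numbers used here are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence


def assign_projections(n_projections: int,
                       kvp_levels: Sequence[float]) -> Dict[float, List[int]]:
    """Group projection indices by kVp level.

    Assumption: the kVp level cycles round-robin through distinct values,
    one level per projection. The real switching pattern may differ.
    """
    channels: Dict[float, List[int]] = {kvp: [] for kvp in kvp_levels}
    for index in range(n_projections):
        kvp = kvp_levels[index % len(kvp_levels)]
        channels[kvp].append(index)
    return channels


def projection_angles(indices: Sequence[int],
                      n_projections: int,
                      arc_degrees: float = 360.0) -> List[float]:
    """Gantry angles (degrees) of the given projections.

    Assumption: projections are equally spaced over the scan arc.
    """
    step = arc_degrees / n_projections
    return [i * step for i in indices]


if __name__ == "__main__":
    # Hypothetical scan: 12 projections cycling through three kVp levels.
    channels = assign_projections(n_projections=12, kvp_levels=[80, 100, 120])
    for kvp, indices in channels.items():
        print(kvp, indices, projection_angles(indices, n_projections=12))
```

Under these assumptions, each of the K energy channels receives only every K-th projection, so its angular sampling is K times coarser than that of the full scan, and the sampled angles are offset from one channel to the next.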