Long-term, continuous physiological recordings are being intensively investigated as a means of tracking emotions. Emotional valence is of particular interest because of its relevance to cardiac and neurophysiological disease. In this research, multiple configurable convolutional neural networks (CNNs) were developed, each taking as input the images produced by a different image-encoding technique. The images were generated from valence-labelled heart rate (HR) signals recorded from 80 participants with a wearable wristband sensor in a daily setting over one week. Ensemble classification was then used to combine the CNNs: a simple support vector machine (SVM) classifier was trained using the last output layers of the CNNs as its input. The ensemble achieved accuracies of more than 91% in the binary classification of emotional valence, an improvement of more than 19% compared to using the CNNs on their own.
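
As an illustration of the ensembling step, the following minimal sketch (not the authors' implementation) concatenates the output vectors of several CNNs into one feature vector per sample and trains a linear SVM on those vectors. It assumes that each CNN emits two class scores per sample. The function names, the toy score values, and the plain subgradient-descent trainer, which stands in for a library SVM, are all assumptions made for this example.

```python
def stack_outputs(per_cnn_outputs):
    """Concatenate each sample's output vectors from all CNNs into one feature row."""
    return [
        [value for vector in sample_vectors for value in vector]
        for sample_vectors in zip(*per_cnn_outputs)
    ]


def train_linear_svm(X, y, lr=0.1, lam=0.01, epochs=200):
    """Full-batch subgradient descent on the L2-regularised hinge loss.

    Labels in y must be -1 or +1. Returns (weights, bias).
    This is a simple stand-in for a library SVM, not a tuned solver.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # sample violates the margin, so it contributes
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 (positive valence) or -1 (negative valence)."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Made-up outputs [p_negative, p_positive] of two hypothetical CNNs for four samples.
cnn_a = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
cnn_b = [[0.7, 0.3], [0.6, 0.4], [0.4, 0.6], [0.3, 0.7]]
labels = [-1, -1, 1, 1]

features = stack_outputs([cnn_a, cnn_b])
weights, bias = train_linear_svm(features, labels)
predictions = [predict(weights, bias, x) for x in features]
```

In this sketch the SVM sees one row per sample whose length is the sum of the CNN output sizes, so adding a further CNN only widens the feature row and leaves the training routine unchanged.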