Illuminant estimation error detection for outdoor scenes using transformers

Color constancy is an important property of the human visual system that allows us to recognize the colors of objects regardless of the scene illumination. Computational color constancy is an unavoidable part of all modern camera image processing pipelines. However, most modern computational color constancy methods estimate only one illuminant per scene, even though a scene may contain multiple illuminants, as in the very common case of outdoor scenes illuminated by sunlight. In this work, we address this problem by creating a deep learning model for image segmentation based on the transformer architecture, which identifies regions in outdoor scenes where the global illuminant estimation and the subsequent color correction of the image are inaccurate. We compare our convolution-free model to a convolutional model and a simpler baseline model and achieve excellent results.
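To make the single-illuminant setting concrete, the following is a minimal sketch of a classic global pipeline: estimating one illuminant with the gray-world assumption and applying a diagonal (von Kries) correction. This is an illustrative baseline, not the method proposed in this work; all function names and the synthetic test scene are our own constructions.

```python
import numpy as np

def gray_world_estimate(img):
    """Estimate a single global illuminant as the mean RGB of the image
    (the gray-world assumption); `img` is an HxWx3 float array in [0, 1]."""
    e = img.reshape(-1, 3).mean(axis=0)
    return e / np.linalg.norm(e)  # unit-norm illuminant direction

def von_kries_correct(img, illuminant):
    """Apply a per-channel diagonal gain so the estimated illuminant
    is mapped to neutral gray."""
    gains = illuminant.mean() / illuminant
    return np.clip(img * gains, 0.0, 1.0)

# Synthetic scene: an achromatic (gray) scene under a reddish illuminant.
rng = np.random.default_rng(0)
scene = rng.uniform(0.2, 0.8, size=(32, 32, 1)) * np.ones((1, 1, 3))
illum = np.array([1.0, 0.8, 0.6])        # reddish cast
observed = np.clip(scene * illum, 0.0, 1.0)

est = gray_world_estimate(observed)
restored = von_kries_correct(observed, est)
# For a truly gray scene the restored channels become (nearly) equal again.
```

A single global estimate like this fails exactly in the mixed-illumination regions this work targets, e.g. shadowed areas of a sunlit outdoor scene, where the local illuminant differs from the global one.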