Spatial Chirp-Z Transformer Networks

Convolutional Neural Networks are often used for computer vision tasks because they inherently model the translation invariance in images. In this paper, we propose a new module to model rotation and scaling invariances in images. To do this, we rely on the chirp-Z transform to perform the desired translation, rotation and scaling in the frequency domain. This approach scales well and is differentiable, owing to the computationally cheap sinc interpolation.
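To make the role of the chirp-Z transform concrete, the following is a minimal one-dimensional sketch in NumPy, not the implementation used in this work. It evaluates the textbook definition X[k] = sum_n x[n] a^(-n) w^(nk) directly, at O(NM) cost, rather than with the fast FFT-based algorithm. The function name `czt_direct` and its parameters are hypothetical and chosen only for illustration. Here `w` sets the spacing between the evaluated frequencies (a scaling of the frequency grid) and `a` sets the starting frequency (an offset).

```python
import numpy as np


def czt_direct(x, m, w, a=1.0):
    """Directly evaluate X[k] = sum_n x[n] * a**(-n) * w**(n*k), k = 0..m-1.

    Hypothetical helper for illustration only: it builds the full (m, n)
    kernel matrix, so it costs O(n*m) instead of using the fast algorithm.
    """
    x = np.asarray(x)
    a = complex(a)
    w = complex(w)
    n = np.arange(len(x))
    k = np.arange(m)
    # Row k of the kernel holds a**(-n) * w**(n*k) for every sample index n.
    kernel = (a ** (-n))[None, :] * w ** (k[:, None] * n[None, :])
    return kernel @ x


rng = np.random.default_rng(0)
x = rng.standard_normal(8)
N = len(x)

# With a = 1 and w = exp(-2*pi*i/N), the transform reduces to the ordinary DFT.
dft = czt_direct(x, N, np.exp(-2j * np.pi / N))

# Halving the frequency spacing samples the same spectrum on a grid twice as
# fine, which amounts to a scaling of the frequency axis.
zoom = czt_direct(x, N, np.exp(-2j * np.pi * 0.5 / N))
```

In this sketch, every second entry of `zoom` coincides with the first half of `dft`, since both evaluate the same underlying spectrum at the same frequencies. Because the output is a smooth function of `w` and `a`, the scale and offset can in principle be treated as differentiable parameters.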