In many image and video processing applications, the ability to resize by a fractional factor, such as from 1080p to 720p, is essential. However, conventional CNN layers can only be used to alter the resolution of their inputs with integer scale factors. In this paper, we propose a downsampling network architecture that progressively reconstructs residuals at different scales. In particular, the aforementioned problem is solved by combining an upsampling sub-network and a downsampling subnetwork, both with integer scale factor. As an application, we apply the proposed downsampling network to an adaptive bitrate video streaming scenario. We extensively evaluate with different video codecs and upsampling algorithms to show the generality of our model. Our experimental results show that improvements in coding efficiency over the conventional Lanczos downsampling and state-of-the-art methods are attained, measured in different perceptual video quality models on large-resolution test videos.