Compressing convolutional neural networks with cheap convolutions and online distillation