Toward Training at ImageNet Scale with Differential Privacy