In Defense of Pure 16-bit Floating-Point Neural Networks