Understanding weight-magnitude hyperparameters in training binary networks