Gradient descent efficiently learns positive definite deep linear residual networks