Gradient-based Bi-level Optimization for Deep Learning: A Survey