Self-Tuning Networks: Bilevel Optimization of Hyperparameters using Structured Best-Response Functions