Learning to Track Humans in Videos