Learning Structured Reactive Navigation Plans from Executing MDP policies