Neural spatial consistency of hierarchical vision models