Hierarchical World Models as Visual Whole-Body Humanoid Controllers