Cross‐modal decoupling in temporal attention