LocATe: End-to-end Localization of Actions in 3D with Transformers