Learning temporal information and object relation for zero-shot action recognition