Multi-Robot Inverse Reinforcement Learning Under Occlusion with State Transition Estimation (Extended Abstract)