Attention Mechanism based Cognition-level Scene Understanding