Hierarchical Network with Label Embedding for Contextual Emotion Recognition