Hierarchical representation learning for big complex multimedia data