Paper Information - A Multi-label Multimodal Deep Learning Framework for Imbalanced Data Classification

Social media and Web services have produced a vast amount of multimedia content. This explosion of multimedia data presents the multimedia community with new challenges and exciting opportunities. This paper presents a new multimedia framework that addresses some of the main challenges in this area. In particular, it presents a multi-label multimodal framework for imbalanced data classification. To this end, it uses audio, visual, and textual data modalities and automatically generates static and temporal features using spatio-temporal deep neural networks. It also handles data with non-uniform distributions using a weighted multi-label classifier. To evaluate the framework, a video dataset of natural disasters is used for multi-label classification. Extensive experiments on this dataset show that the proposed framework outperforms existing work.
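The abstract does not specify how the weighted multi-label classifier assigns its weights. As a minimal sketch of the general idea, assuming inverse-frequency weighting of each label's positive class (an assumption, not the paper's stated method), a weighted binary cross-entropy could look like the following. The function names `label_weights` and `weighted_bce` and the toy label matrix are hypothetical.

```python
import math

def label_weights(Y):
    """Per-label positive-class weight: (#negatives / #positives) per label column.

    Assumption: inverse-frequency weighting; the paper's actual scheme may differ.
    """
    n = len(Y)
    weights = []
    for j in range(len(Y[0])):
        pos = sum(row[j] for row in Y)
        # Fall back to a neutral weight if a label has no positive samples.
        weights.append((n - pos) / pos if pos > 0 else 1.0)
    return weights

def weighted_bce(y_true, y_prob, pos_weight, eps=1e-7):
    """Mean weighted binary cross-entropy over the labels of one sample."""
    total = 0.0
    for y, p, w in zip(y_true, y_prob, pos_weight):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(w * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Toy multi-label matrix: rows are samples, columns are two hypothetical labels.
# Label 0 is positive in 2 of 4 samples, label 1 in only 1 of 4.
Y = [[1, 0], [1, 0], [0, 0], [0, 1]]
w = label_weights(Y)

# Missing the rare label (column 1) is penalized more than missing the common one.
loss_miss_rare = weighted_bce([1, 1], [0.9, 0.1], w)
loss_miss_common = weighted_bce([1, 1], [0.1, 0.9], w)
```

With this weighting, a missed positive on an under-represented label contributes more to the loss than a missed positive on a frequent one, which is one common way to counter non-uniform label distributions.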

[1] Sergey Ioffe, et al. Rethinking the Inception Architecture for Computer Vision, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2] Min-Ling Zhang, et al. A Review on Multi-Label Learning Algorithms, 2014, IEEE Transactions on Knowledge and Data Engineering.

[3] C. V. Jawahar, et al. Multi-label Cross-Modal Retrieval, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[4] Nitesh V. Chawla, et al. SMOTE: Synthetic Minority Over-sampling Technique, 2002, J. Artif. Intell. Res.

[5] Shaogang Gong, et al. Imbalanced Deep Learning by Minority Class Incremental Rectification, 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[6] Jeffrey Pennington, et al. GloVe: Global Vectors for Word Representation, 2014, EMNLP.

[7] Yin Li, et al. Learning Deep Structure-Preserving Image-Text Embeddings, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8] Grigorios Tsoumakas, et al. Multi-Label Classification of Music into Emotions, 2008, ISMIR.

[9] Yuan Jiang, et al. Complex Object Classification: A Multi-Modal Multi-Instance Multi-Label Deep Network with Optimal Transport, 2018, KDD.

[10] Fernando Bação, et al. Effective data generation for imbalanced learning using conditional generative adversarial networks, 2018, Expert Syst. Appl.

[11] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12] Mehrdad Nourani, et al. Predicting Drug-Target Interaction Using Deep Matrix Factorization, 2018, 2018 IEEE Biomedical Circuits and Systems Conference (BioCAS).

[13] Shu-Ching Chen, et al. Dynamic Sampling in Convolutional Neural Networks for Imbalanced Data Classification, 2018, 2018 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR).

[14] Shu-Ching Chen, et al. Deep Spatio-Temporal Representation Learning for Multi-Class Imbalanced Data Classification, 2018, 2018 IEEE International Conference on Information Reuse and Integration (IRI).

[15] Shu-Ching Chen, et al. Multimodal deep representation learning for video classification, 2018, World Wide Web.

[16] Liang Wang, et al. Unconstrained Multimodal Multi-Label Learning, 2015, IEEE Transactions on Multimedia.

[17] Mehrdad Nourani, et al. Feature Selection to Predict Compound's Effect on Aging, 2018, BCB.

[18] Antonio Torralba, et al. SoundNet: Learning Sound Representations from Unlabeled Video, 2016, NIPS.

[19] Seong-Whan Lee, et al. Hierarchical feature representation and multimodal fusion with deep learning for AD/MCI diagnosis, 2014, NeuroImage.

[20] Lars Schmidt-Thieme, et al. Cost-sensitive learning methods for imbalanced data, 2010, The 2010 International Joint Conference on Neural Networks (IJCNN).

[21] Grigorios Tsoumakas, et al. Random K-labelsets for Multilabel Classification, 2022.

[22] Mei-Ling Shyu, et al. Multimodal deep learning based on multiple correspondence analysis for disaster management, 2018, World Wide Web.

[23] Shu-Ching Chen, et al. Enhancing Multimedia Imbalanced Concept Detection Using VIMP in Random Forests, 2016, 2016 IEEE 17th International Conference on Information Reuse and Integration (IRI).

[24] Zhi-Hua Zhou, et al. ML-KNN: A lazy learning approach to multi-label learning, 2007, Pattern Recognit.