A Unified Multi-Task Semantic Communication System for Multimodal Data

Task-oriented semantic communication has achieved significant performance gains. However, the model must be updated whenever the task changes, or multiple models must be stored to serve different tasks. To address this issue, we develop a unified deep learning enabled semantic communication system (U-DeepSC), in which a single end-to-end framework serves many different tasks across multiple modalities. Because task difficulty varies, different tasks require different numbers of neural network layers. We therefore develop a multi-exit architecture in U-DeepSC that provides early-exit results for relatively simple tasks. To reduce the transmission overhead, we design a unified codebook for feature representation that serves multiple tasks, so that only the codebook indices of the task-specific features are transmitted. Moreover, because the number of required features varies from task to task, we propose a dimension-wise dynamic scheme that adjusts the number of transmitted indices for each task. The dynamic scheme can also adapt the number of transmitted features to the channel conditions to optimize transmission efficiency. Simulation results show that the proposed U-DeepSC achieves performance comparable to that of task-oriented semantic communication systems designed for a specific task, with a significant reduction in both transmission overhead and model size.
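
The codebook idea can be illustrated with a minimal sketch. It is not the U-DeepSC implementation: the toy codebook, the feature vectors, and the names `nearest_index`, `encode`, `decode`, and `num_to_send` are all assumptions made for illustration. In particular, the sketch simply keeps the first `num_to_send` features, whereas the abstract describes a learned dimension-wise scheme that chooses how many features to send per task and channel condition.

```python
import math


def nearest_index(feature, codebook):
    """Return the index of the codeword closest to `feature` (squared Euclidean distance)."""
    best_index, best_dist = 0, math.inf
    for i, codeword in enumerate(codebook):
        dist = sum((f - c) ** 2 for f, c in zip(feature, codeword))
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index


def encode(features, codebook, num_to_send):
    """Transmitter side: quantize the first `num_to_send` features to codebook indices.

    Keeping the first features is a placeholder for a learned selection rule.
    """
    return [nearest_index(f, codebook) for f in features[:num_to_send]]


def decode(indices, codebook):
    """Receiver side: look up each received index in the shared codebook."""
    return [codebook[i] for i in indices]


# Hypothetical shared codebook with four 2-dimensional codewords.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Hypothetical encoder output: three feature vectors for one input sample.
features = [[0.9, 0.1], [0.1, 0.8], [0.6, 0.7]]

# A simpler task (or a poorer channel) might send fewer indices; here, two.
indices = encode(features, codebook, num_to_send=2)
recovered = decode(indices, codebook)

# Each index costs ceil(log2(codebook size)) bits instead of a full float vector.
bits_per_index = math.ceil(math.log2(len(codebook)))

print(indices, recovered, bits_per_index)
```

In this toy setting, each transmitted feature costs `bits_per_index` bits rather than one floating-point value per dimension, and lowering `num_to_send` shrinks the payload further at the cost of giving the receiver fewer features.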
