SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks
Kai-Wei Chang | Hung-yi Lee | Shang-Wen Li | W. Tseng | Hua Shen | Yu-Kai Wang | Iu-thing Kang
[1] Hiroaki Hayashi, et al. Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing, 2021, ACM Comput. Surv.
[2] W. Reif, et al. Finstreder: Simple and fast Spoken Language Understanding with Finite State Transducers using modern Speech-to-Text models, 2022, ArXiv.
[3] P. Bhattacharyya, et al. A Multimodal Corpus for Emotion Recognition in Sarcasm, 2022, LREC.
[4] Hung-yi Lee, et al. Structured Prompt Tuning, 2022, ArXiv.
[5] Tara N. Sainath, et al. Self-Supervised Speech Representation Learning: A Review, 2022, IEEE Journal of Selected Topics in Signal Processing.
[6] Kai-Wei Chang, et al. SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks, 2022, arXiv:2203.16773.
[7] M. Hasegawa-Johnson, et al. WAVPROMPT: Towards Few-Shot Spoken Language Understanding with Frozen Language Models, 2022, INTERSPEECH.
[8] S. Dubnov, et al. HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection, 2022, ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] A. Jansen, et al. Universal Paralinguistic Speech Representations Using Self-Supervised Conformers, 2021, ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Abdel-rahman Mohamed, et al. Text-Free Prosody-Aware Generative Spoken Language Modeling, 2021, ACL.
[11] Tomi Kinnunen, et al. Voxceleb Enrichment for Age and Gender Recognition, 2021, 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[12] Chao-Han Huck Yang, et al. Voice2Series: Reprogramming Acoustic Models for Time Series Classification, 2021, ICML.
[13] Ruslan Salakhutdinov, et al. HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units, 2021, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[14] Tomi Kinnunen, et al. ASVspoof 2019: Spoofing Countermeasures for the Detection of Synthesized, Converted and Replayed Speech, 2021, IEEE Transactions on Biometrics, Behavior, and Identity Science.
[15] Emmanuel Dupoux, et al. On Generative Spoken Language Modeling from Raw Audio, 2021, Transactions of the Association for Computational Linguistics.
[16] Roman Vygon, et al. Learning Efficient Representations for Keyword Spotting with Triplet Loss, 2021, SPECOM.
[17] Boris Ginsburg, et al. MarbleNet: Deep 1D Time-Channel Separable Convolutional Neural Network for Voice Activity Detection, 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Yu Tsao, et al. A Study of Low-Resource Speech Commands Recognition based on Adversarial Reprogramming, 2021, ArXiv.
[19] Percy Liang, et al. Prefix-Tuning: Optimizing Continuous Prompts for Generation, 2021, ACL.
[20] Dmitrij Šešok, et al. Unsupervised Pre-Training for Voice Activation, 2020, Applied Sciences.
[21] Philip John Gorinski, et al. Improving End-to-End Speech-to-Intent Classification with Reptile, 2020, INTERSPEECH.
[22] Afroz Ahamad, et al. AccentDB: A Database of Non-Native English Accents to Assist Neural Speech Recognition, 2020, LREC.
[23] Verónica Pérez-Rosas, et al. Towards Multimodal Sarcasm Detection (An _Obviously_ Perfect Paper), 2019, ACL.
[24] Yoshua Bengio, et al. Speech Model Pre-training for End-to-End Spoken Language Understanding, 2019, INTERSPEECH.
[25] Pete Warden, et al. Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition, 2018, ArXiv.
[26] Joon Son Chung, et al. VoxCeleb: A Large-Scale Speaker Identification Dataset, 2017, INTERSPEECH.
[27] Scott Lundberg, et al. A Unified Approach to Interpreting Model Predictions, 2017, NIPS.
[28] Xavier Serra, et al. Freesound Datasets: A Platform for the Creation of Open Audio Datasets, 2017, ISMIR.
[29] Karol J. Piczak. ESC: Dataset for Environmental Sound Classification, 2015, ACM Multimedia.
[30] Hugo Van hamme, et al. Acquisition of ordinal words using weakly supervised NMF, 2014, 2014 IEEE Spoken Language Technology Workshop (SLT).
[31] Carlos Busso, et al. IEMOCAP: interactive emotional dyadic motion capture database, 2008, Lang. Resour. Evaluation.