[1] M. Yuan,et al. Model selection and estimation in regression with grouped variables , 2006 .
[2] R Devon Hjelm,et al. Learning Representations by Maximizing Mutual Information Across Views , 2019, NeurIPS.
[3] Alexei A. Efros,et al. Unsupervised Visual Representation Learning by Context Prediction , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[4] Neil Zeghidour,et al. Contrastive Learning of General-Purpose Audio Representations , 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] J. Lee,et al. Predicting What You Already Know Helps: Provable Self-Supervised Learning , 2020, NeurIPS.
[6] Hao Tang,et al. An Unsupervised Autoregressive Model for Speech Representation Learning , 2019, INTERSPEECH.
[7] Vishrav Chaudhary,et al. Self-training Improves Pre-training for Natural Language Understanding , 2020, NAACL.
[8] Kevin Gimpel,et al. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations , 2019, ICLR.
[9] Arthur Gretton,et al. Self-Supervised Learning with Kernel Dependence Maximization , 2021, NeurIPS.
[10] Abhinav Shukla,et al. Learning Speech Representations from Raw Audio by Joint Audiovisual Self-Supervision , 2020, ArXiv.
[11] Aren Jansen,et al. Evaluating speech features with the minimal-pair ABX task: analysis of the classical MFC/PLP pipeline , 2013, INTERSPEECH.
[12] Shinji Watanabe,et al. SUPERB: Speech processing Universal PERformance Benchmark , 2021, Interspeech.
[13] Yoshua Bengio,et al. Multi-Task Self-Supervised Learning for Robust Speech Recognition , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Patrick Pérez,et al. Boosting Few-Shot Visual Learning With Self-Supervision , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[15] Emmanuel Dupoux,et al. Learning Word Embeddings: Unsupervised Methods for Fixed-size Representations of Variable-length Speech Segments , 2018, INTERSPEECH.
[16] Francis M. Tyers,et al. Common Voice: A Massively-Multilingual Speech Corpus , 2020, LREC.
[17] Hermann Ney,et al. RWTH ASR Systems for LibriSpeech: Hybrid vs Attention - w/o Data Augmentation , 2019, INTERSPEECH.
[18] Morgan Sonderegger,et al. Montreal Forced Aligner: Trainable Text-Speech Alignment Using Kaldi , 2017, INTERSPEECH.
[19] Hynek Hermansky,et al. RASTA-PLP speech analysis technique , 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[20] James R. Glass,et al. A Convolutional Deep Markov Model for Unsupervised Speech Representation Learning , 2020, INTERSPEECH.
[21] Jon Sánchez,et al. Automatic emotion recognition using prosodic parameters , 2005, INTERSPEECH.
[22] Yi-Hsuan Yang,et al. Multitask Learning for Frame-level Instrument Recognition , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[23] Hagen Soltau,et al. Joint Speech Recognition and Speaker Diarization via Sequence Transduction , 2019, INTERSPEECH.
[24] Titouan Parcollet,et al. LeBenchmark: A Reproducible Framework for Assessing Self-Supervised Representation Learning from Speech , 2021, Interspeech 2021.
[25] Sanjeev Khudanpur,et al. Librispeech: An ASR corpus based on public domain audio books , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[26] Emmanuel Dupoux,et al. Evaluating the reliability of acoustic speech embeddings , 2020, INTERSPEECH.
[27] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[28] Guangsen Wang,et al. Speech-XLNet: Unsupervised Acoustic Model Pretraining For Self-Attention Networks , 2020, INTERSPEECH.
[29] Sanjeev Khudanpur,et al. A time delay neural network architecture for efficient modeling of long temporal contexts , 2015, INTERSPEECH.
[30] Ramón Fernández Astudillo,et al. From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification , 2016, ICML.
[31] Ruslan Salakhutdinov,et al. HuBERT: How Much Can a Bad Teacher Benefit ASR Pre-Training? , 2021, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[32] Paolo Favaro,et al. Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles , 2016, ECCV.
[33] Alexei Baevski,et al. wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations , 2020, NeurIPS.
[34] Rajen Dinesh Shah,et al. The hardness of conditional independence testing and the generalised covariance measure , 2018, The Annals of Statistics.
[35] Kun Han,et al. Speech SIMCLR: Combining Contrastive and Reconstruction Objective for Self-supervised Speech Representation Learning , 2021, Interspeech 2021.
[36] Johan Sundberg,et al. Effects of vocal loudness variation on spectrum balance as reflected by the alpha measure of long-term-average spectra of speech , 2006, The Journal of the Acoustical Society of America.
[37] Chao Wang,et al. Multi-Task Self-Supervised Pre-Training for Music Classification , 2021, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[38] Emmanuel Dupoux,et al. On Generative Spoken Language Modeling from Raw Audio , 2021, Transactions of the Association for Computational Linguistics.
[39] Colin Wei,et al. Theoretical Analysis of Self-Training with Deep Networks on Unlabeled Data , 2020, ICLR.
[40] Hung-yi Lee,et al. Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[41] Meng Li,et al. Exploring wav2vec 2.0 on speaker verification and language identification , 2020, Interspeech.
[42] Le Song,et al. A Kernel Statistical Test of Independence , 2007, NIPS.
[43] Titouan Parcollet,et al. Conditional independence for pretext task selection in Self-supervised speech representation learning , 2021, Interspeech 2021.
[44] Yves Grandvalet,et al. More efficiency in multiple kernel learning , 2007, ICML '07.
[45] Andrew Zisserman,et al. Multi-task Self-Supervised Visual Learning , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[46] Yoshua Bengio,et al. Learning Problem-agnostic Speech Representations from Multiple Self-supervised Tasks , 2019, INTERSPEECH.
[47] Kshitij Dwivedi,et al. Representation Similarity Analysis for Efficient Task Taxonomy & Transfer Learning , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[48] Björn W. Schuller,et al. Comparing one and two-stage acoustic modeling in the recognition of emotion in speech , 2007, 2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU).
[49] Aren Jansen,et al. Rapid Evaluation of Speech Representations for Spoken Term Discovery , 2011, INTERSPEECH.
[50] In-So Kweon,et al. Learning Image Representations by Completing Damaged Jigsaw Puzzles , 2018, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV).
[51] Ronan Collobert,et al. Unsupervised Cross-lingual Representation Learning for Speech Recognition , 2020, Interspeech.
[52] Alex Graves,et al. Connectionist Temporal Classification , 2012 .
[53] Joon Son Chung,et al. VoxCeleb: A Large-Scale Speaker Identification Dataset , 2017, INTERSPEECH.
[54] Cordelia Schmid,et al. What makes for good views for contrastive learning , 2020, NeurIPS.
[55] Michal Valko,et al. Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning , 2020, NeurIPS.
[56] Sanjeev Khudanpur,et al. X-Vectors: Robust DNN Embeddings for Speaker Recognition , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[57] Omer Levy,et al. SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems , 2019, NeurIPS.
[58] Andrew Zisserman,et al. Objects that Sound , 2017, ECCV.
[59] Sanjeev Arora,et al. A Mathematical Exploration of Why Language Models Help Solve Downstream Tasks , 2021, ICLR.
[60] Sergey Ioffe,et al. Probabilistic Linear Discriminant Analysis , 2006, ECCV.
[61] Mohammad Norouzi,et al. Big Self-Supervised Models are Strong Semi-Supervised Learners , 2020, NeurIPS.
[62] Gunnar Rätsch,et al. Large Scale Multiple Kernel Learning , 2006, J. Mach. Learn. Res.
[63] Björn Schuller,et al. openSMILE: the Munich versatile and fast open-source audio feature extractor , 2010, ACM Multimedia.
[64] Fuhui Long,et al. Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy , 2003, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[65] Shoichiro Takeda,et al. Multiple Pretext-Task for Self-Supervised Learning via Mixing Multiple Image Transformations , 2019, ArXiv.
[66] John R. Hershey,et al. Speech enhancement and recognition using multi-task learning of long short-term memory recurrent neural networks , 2015, INTERSPEECH.
[67] Aren Jansen,et al. A comparison of neural network methods for unsupervised representation learning on the zero resource speech challenge , 2015, INTERSPEECH.
[68] Carlos Busso,et al. IEMOCAP: interactive emotional dyadic motion capture database , 2008, Lang. Resour. Evaluation.
[69] Leonidas J. Guibas,et al. Taskonomy: Disentangling Task Transfer Learning , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[70] Mikhail Khodak,et al. A Theoretical Analysis of Contrastive Unsupervised Representation Learning , 2019, ICML.
[71] Yingli Tian,et al. Self-Supervised Visual Feature Learning With Deep Neural Networks: A Survey , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[72] Isabelle Guyon,et al. An Introduction to Variable and Feature Selection , 2003, J. Mach. Learn. Res.
[73] Peter J. Murphy,et al. Cepstrum-Based Harmonics-to-Noise Ratio Measurement in Voiced Speech , 2004, Summer School on Neural Networks.
[74] James R. Glass,et al. Unsupervised Methods for Evaluating Speech Representations , 2020, INTERSPEECH.
[75] Quoc V. Le,et al. Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition , 2020, ArXiv.
[76] Nikos Komodakis,et al. Unsupervised Representation Learning by Predicting Image Rotations , 2018, ICLR.
[77] Gaël Richard,et al. Acoustic Features for Environmental Sound Analysis , 2018 .
[78] Oriol Vinyals,et al. Representation Learning with Contrastive Predictive Coding , 2018, ArXiv.
[79] Xin Wang,et al. TAFE-Net: Task-Aware Feature Embeddings for Low Shot Learning , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[80] Laurens van der Maaten,et al. Self-Supervised Learning of Pretext-Invariant Representations , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[81] Daniel Garcia-Romero,et al. Time delay deep neural network-based universal background models for speaker recognition , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[82] Michael Tschannen,et al. On Mutual Information Maximization for Representation Learning , 2019, ICLR.
[83] Geoffrey E. Hinton,et al. A Simple Framework for Contrastive Learning of Visual Representations , 2020, ICML.
[84] To all authors , 1995 .
[85] Andreas Stolcke,et al. Wav2vec-C: A Self-supervised Model for Speech Representation Learning , 2021, Interspeech.
[86] Aixia Guo,et al. Gene Selection for Cancer Classification using Support Vector Machines , 2014 .