Cross-Task Transfer for Geotagged Audiovisual Aerial Scene Recognition
Dong Chen | Dejing Dou | Lichao Mou | Xuhong Li | Liping Jing | Pu Jin | Di Hu | Xiaoxiang Zhu
[1] Geoffrey E. Hinton,et al. Visualizing Data using t-SNE, 2008.
[2] Luisa Verdoliva,et al. Land Use Classification in Remote Sensing Images by Convolutional Neural Networks , 2015, ArXiv.
[3] Xiao Xiang Zhu,et al. A Relation-Augmented Fully Convolutional Network for Semantic Segmentation in Aerial Scenes , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Cordelia Schmid,et al. Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).
[5] Ruslan Salakhutdinov,et al. Cross-Task Knowledge Transfer for Visually-Grounded Navigation, 2018.
[6] Tong Zhang,et al. Deep Learning Based Feature Selection for Remote Sensing Scene Classification , 2015, IEEE Geoscience and Remote Sensing Letters.
[7] Aren Jansen,et al. Audio Set: An ontology and human-labeled dataset for audio events , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] Lei Guo,et al. When Deep Learning Meets Metric Learning: Remote Sensing Image Scene Classification via Learning Discriminative CNNs , 2018, IEEE Transactions on Geoscience and Remote Sensing.
[9] Geoffrey E. Hinton,et al. Distilling the Knowledge in a Neural Network , 2015, ArXiv.
[10] Vladimir Risojevic,et al. Aerial image classification using structural texture similarity , 2011, 2011 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT).
[11] Ryosuke Yamanishi,et al. Sound Event Detection by Multitask Learning of Sound Events and Scenes with Soft Scene Labels , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Yong Jae Lee,et al. Audiovisual SlowFast Networks for Video Recognition , 2020, ArXiv.
[13] Bolei Zhou,et al. Learning Deep Features for Discriminative Localization , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[14] Antonio Torralba,et al. SoundNet: Learning Sound Representations from Unlabeled Video , 2016, NIPS.
[15] Tatsuya Harada,et al. Image Reconstruction from Bag-of-Visual-Words , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[16] Gui-Song Xia,et al. AID: A Benchmark Data Set for Performance Evaluation of Aerial Scene Classification , 2016, IEEE Transactions on Geoscience and Remote Sensing.
[17] Shimon Whiteson,et al. LipNet: End-to-End Sentence-level Lipreading, 2016, arXiv:1611.01599.
[18] Chuang Gan,et al. Self-Supervised Moving Vehicle Tracking With Stereo Sound , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[19] Xuelong Li,et al. Deep Multimodal Clustering for Unsupervised Audiovisual Learning , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[20] Shawn D. Newsam,et al. Comparing SIFT descriptors and Gabor texture features for classification of remote sensed imagery , 2008, 2008 15th IEEE International Conference on Image Processing.
[21] Chenliang Xu,et al. Audio-Visual Event Localization in Unconstrained Videos , 2018, ECCV.
[22] Z. Babic,et al. Orientation difference descriptor for aerial image classification , 2012, 2012 19th International Conference on Systems, Signals and Image Processing (IWSSIP).
[23] Andrew Owens,et al. Audio-Visual Scene Analysis with Self-Supervised Multisensory Features , 2018, ECCV.
[24] Jefersson Alex dos Santos,et al. Towards better exploiting convolutional neural networks for remote sensing scene classification , 2016, Pattern Recognit.
[25] Andrew Owens,et al. Learning Sight from Sound: Ambient Sound Provides Supervision for Visual Learning , 2017, International Journal of Computer Vision.
[26] Xuelong Li,et al. Temporal Multimodal Learning in Audiovisual Speech Recognition , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Feiping Nie,et al. Curriculum Audiovisual Learning , 2020, ArXiv.
[28] Yihong Gong,et al. Locality-constrained Linear Coding for image classification , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[29] Trevor Darrell,et al. Simultaneous Deep Transfer Across Domains and Tasks , 2015, ICCV.
[30] Kaiqi Huang,et al. Learning Deep Context-Aware Features over Body and Latent Parts for Person Re-identification , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[31] Louis-Philippe Morency,et al. Multimodal Machine Learning: A Survey and Taxonomy , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[32] Antonio Torralba,et al. Anticipating Visual Representations from Unlabeled Video , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[33] Bo Du,et al. Scene Classification via a Gradient Boosting Random Convolutional Network Framework , 2016, IEEE Transactions on Geoscience and Remote Sensing.
[34] Mohamed R. Amer,et al. Facial Attributes Classification Using Multi-task Representation Learning , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[35] Chuang Gan,et al. The Sound of Pixels , 2018, ECCV.
[36] Andrzej Cichocki,et al. EmotionMeter: A Multimodal Framework for Recognizing Human Emotions , 2019, IEEE Transactions on Cybernetics.
[37] Rob Fergus,et al. Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-scale Convolutional Architecture , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[38] Scott Workman,et al. A Multimodal Approach to Mapping Soundscapes , 2018, IGARSS 2018 - 2018 IEEE International Geoscience and Remote Sensing Symposium.