DiffTAD: Temporal Action Detection with Proposal Denoising Diffusion
[1] Xiatian Zhu,et al. Generative Semantic Segmentation , 2023, ArXiv.
[2] P. Luo,et al. DiffusionDet: Diffusion Model for Object Detection , 2022, ArXiv.
[3] T. Blundell,et al. Structure-based Drug Design with Equivariant Diffusion Models , 2022, ArXiv.
[4] Lingpeng Kong,et al. DiffuSeq: Sequence to Sequence Text Generation with Diffusion Models , 2022, ICLR.
[5] David J. Fleet,et al. A Generalist Framework for Panoptic Segmentation of Images and Videos , 2022, ArXiv.
[6] Jong-Chul Ye,et al. Diffusion Adversarial Representation Learning for Self-supervised Vessel Segmentation , 2022, ICLR.
[7] Yaniv Taigman,et al. Make-A-Video: Text-to-Video Generation without Text-Video Data , 2022, ICLR.
[8] Stan Z. Li,et al. A Survey on Generative Diffusion Model , 2022, ArXiv.
[9] Mao Ye,et al. Diffusion-based Molecule Generation with Informative Prior Bridges , 2022, NeurIPS.
[10] Zhongang Cai,et al. MotionDiffuse: Text-Driven Human Motion Generation With Diffusion Model , 2022, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[11] Yu-Chiang Frank Wang,et al. Frido: Feature Pyramid Diffusion for Complex Scene Image Synthesis , 2022, AAAI.
[12] Yuanzhen Li,et al. DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation , 2022, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Geoffrey E. Hinton,et al. Analog Bits: Generating Discrete Data using Diffusion Models with Self-Conditioning , 2022, ICLR.
[14] A. Yuille,et al. In Defense of Online Models for Video Instance Segmentation , 2022, ECCV.
[15] Chao Weng,et al. Diffsound: Discrete Diffusion Model for Text-to-Sound Generation , 2022, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[16] Xiatian Zhu,et al. Proposal-Free Temporal Action Detection via Global Segmentation Mask Learning , 2022, ECCV.
[17] Jing Zhang,et al. ReAct: Temporal Action Detection with Relational Queries , 2022, ECCV.
[18] Yi Ren,et al. ProDiff: Progressive Fast Diffusion Model for High-Quality Text-to-Speech , 2022, ACM Multimedia.
[19] D. Samaras,et al. Diffusion models as plug-and-play priors , 2022, NeurIPS.
[20] Brian L. Trippe,et al. Diffusion probabilistic modeling of protein backbones in 3D for the motif-scaffolding problem , 2022, ICLR.
[21] L. Wolf,et al. Zero-Shot Voice Conditioning for Denoising Diffusion TTS Models , 2022, Interspeech.
[22] Emmanuel Asiedu Brempong,et al. Denoising Pretraining for Semantic Segmentation , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[23] T. Jaakkola,et al. Torsional Diffusion for Molecular Conformer Generation , 2022, NeurIPS.
[24] Sung-Hoon Yoon,et al. Guided-TTS 2: A Diffusion Model for High-quality Adaptive Text-to-Speech with Untranscribed Data , 2022, ArXiv.
[25] Xiang Lisa Li,et al. Diffusion-LM Improves Controllable Text Generation , 2022, NeurIPS.
[26] Tudor Achim,et al. Protein Structure and Sequence Generation with Equivariant Denoising Diffusion Probabilistic Models , 2022, ArXiv.
[27] Frank Wood,et al. Flexible Diffusion Modeling of Long Videos , 2022, NeurIPS.
[28] David J. Fleet,et al. Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding , 2022, NeurIPS.
[29] Prafulla Dhariwal,et al. Hierarchical Text-Conditional Image Generation with CLIP Latents , 2022, ArXiv.
[30] David J. Fleet,et al. Video Diffusion Models , 2022, NeurIPS.
[31] Victor Garcia Satorras,et al. Equivariant Diffusion for Molecule Generation in 3D , 2022, ICML.
[32] S. Mandt,et al. Diffusion Probabilistic Modeling for Video Generation , 2022, Entropy.
[33] Pan Pan,et al. RCL: Recurrent Continuous Localization for Temporal Action Detection , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[34] S. Ermon,et al. GeoDiff: a Geometric Diffusion Model for Molecular Conformation Generation , 2022, ICLR.
[35] L. Ni,et al. DN-DETR: Accelerate DETR Training by Introducing Query DeNoising , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[36] Yin Li,et al. ActionFormer: Localizing Moments of Actions with Transformers , 2022, ECCV.
[37] Hang Su,et al. DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR , 2022, ICLR.
[38] Prafulla Dhariwal,et al. GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models , 2021, ICML.
[39] A. Voynov,et al. Label-Efficient Semantic Segmentation with Diffusion Models , 2021, ICLR.
[40] Philippe C. Cattin,et al. Diffusion Models for Implicit Image Segmentation Ensembles , 2021, MIDL.
[41] Lior Wolf,et al. SegDiff: Image Segmentation with Diffusion Probabilistic Models , 2021, ArXiv.
[42] Fang Wen,et al. Vector Quantized Diffusion Model for Text-to-Image Synthesis , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[43] D. Lischinski,et al. Blended Diffusion for Text-driven Editing of Natural Images , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[44] Tao Xiang,et al. Few-Shot Temporal Action Localization with Query Adaptive Transformer , 2021, BMVC.
[45] Vincent Lepetit,et al. 1st Place Solution for the UVO Challenge on Image-based Open-World Segmentation 2021 , 2021, ArXiv.
[46] Taesu Kim,et al. EdiTTS: Score-based Editing for Controllable Text-to-Speech , 2021, INTERSPEECH.
[47] Yingming Wang,et al. Anchor DETR: Query Design for Transformer-Based Object Detection , 2021, 2109.07107.
[48] Niamul Quader,et al. Class Semantics-based Attention for Action Detection , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[49] Gang Hua,et al. Enriching Local and Global Contexts for Temporal Action Localization , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[50] Zeming Li,et al. YOLOX: Exceeding YOLO Series in 2021 , 2021, ArXiv.
[51] Rianne van den Berg,et al. Structured Denoising Diffusion Models in Discrete State-Spaces , 2021, NeurIPS.
[52] Hongxun Yao,et al. Temporal Action Proposal Generation with Transformers , 2021, ArXiv.
[53] Ziqiang Shi,et al. ItôTTS and ItôWave: Linear Stochastic Differential Equation Is All You Need For Audio Generation , 2021, 2105.07583.
[54] Tasnima Sadekova,et al. Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech , 2021, ICML.
[55] Prafulla Dhariwal,et al. Diffusion Models Beat GANs on Image Synthesis , 2021, NeurIPS.
[56] Bernard Ghanem,et al. Low-Fidelity Video Encoder Optimization for Temporal Action Localization , 2021, NeurIPS.
[57] Zeming Li,et al. OTA: Optimal Transport Assignment for Object Detection , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[58] Wei Wu,et al. Temporal Context Aggregation Network for Temporal Action Proposal Refinement , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[59] Prafulla Dhariwal,et al. Improved Denoising Diffusion Probabilistic Models , 2021, ICML.
[60] Limin Wang,et al. Relaxed Transformer Decoders for Direct Action Proposal Generation , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[61] Song Bai,et al. Multi-shot Temporal Event Localization: a Benchmark , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[62] Abhishek Kumar,et al. Score-Based Generative Modeling through Stochastic Differential Equations , 2020, ICLR.
[63] Yi Jiang,et al. Sparse R-CNN: End-to-End Object Detection with Learnable Proposals , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[64] Bernard Ghanem,et al. TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization Tasks , 2020, 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW).
[65] T. Xiang,et al. Boundary-sensitive Pre-training for Temporal Localization in Videos , 2020, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[66] Bin Li,et al. Deformable DETR: Deformable Transformers for End-to-End Object Detection , 2020, ICLR.
[67] Jiaming Song,et al. Denoising Diffusion Implicit Models , 2020, ICLR.
[68] Wei Wu,et al. BSN++: Complementary Boundary Regressor with Scale-Balanced Relation Modeling for Temporal Action Proposal Generation , 2020, AAAI.
[69] Pieter Abbeel,et al. Denoising Diffusion Probabilistic Models , 2020, NeurIPS.
[70] Stefano Ermon,et al. Improved Techniques for Training Score-Based Generative Models , 2020, NeurIPS.
[71] Nicolas Usunier,et al. End-to-End Object Detection with Transformers , 2020, ECCV.
[72] Yuning Jiang,et al. SOLO: Segmenting Objects by Locations , 2019, ECCV.
[73] Ali K. Thabet,et al. G-TAD: Sub-Graph Localization for Temporal Action Detection , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[74] Shilei Wen,et al. BMN: Boundary-Matching Network for Temporal Action Proposal Generation , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[75] Yang Song,et al. Generative Modeling by Estimating Gradients of the Data Distribution , 2019, NeurIPS.
[76] Tao Mei,et al. Gaussian Temporal Awareness Networks for Action Localization , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[77] Hao Chen,et al. FCOS: Fully Convolutional One-Stage Object Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[78] Ming Yang,et al. BSN: Boundary Sensitive Network for Temporal Action Proposal Generation , 2018, ECCV.
[79] Rahul Sukthankar,et al. Rethinking the Faster R-CNN Architecture for Temporal Action Localization , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[80] Carsten Rother,et al. Panoptic Segmentation , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[81] Bernard Ghanem,et al. SST: Single-Stream Temporal Action Proposals , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[82] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[83] Andrew Zisserman,et al. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[84] Limin Wang,et al. Temporal Action Detection with Structured Segment Networks , 2017, International Journal of Computer Vision.
[85] Kate Saenko,et al. R-C3D: Region Convolutional 3D Network for Temporal Activity Detection , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[86] R. Nevatia,et al. TURN TAP: Temporal Unit Regression Network for Temporal Action Proposals , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[87] Kaiming He,et al. Feature Pyramid Networks for Object Detection , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[88] Haroon Idrees,et al. The THUMOS challenge on action recognition for videos "in the wild" , 2016, Comput. Vis. Image Underst.
[89] Bernard Ghanem,et al. ActivityNet: A large-scale video benchmark for human activity understanding , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[90] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[91] Surya Ganguli,et al. Deep Unsupervised Learning using Nonequilibrium Thermodynamics , 2015, ICML.