Pooling Transformer for Detection of Risk Events in In-The-Wild Video Ego Data
H. Amièva | Laura Middleton | J. Benois-Pineau | A. Zemmari | Rupayan Mallick | M. Pech | Thinhinane Yebda
[1] Jenny Benois-Pineau,et al. A GRU Neural Network with attention mechanism for detection of risk situations on multimodal lifelog data , 2021, 2021 International Conference on Content-Based Multimedia Indexing (CBMI).
[2] Stephen Lin,et al. Video Swin Transformer , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[3] Xun Guo,et al. SSAN: Separable Self-Attention Network for Video Representation Learning , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Ivan Marsic,et al. VidTr: Video Transformer Without Convolutions , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[5] Christoph Feichtenhofer,et al. Multiscale Vision Transformers , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[6] Cordelia Schmid,et al. ViViT: A Video Vision Transformer , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[7] M. Ryoo,et al. Coarse-Fine Networks for Temporal Activity Detection in Videos , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[8] Heng Wang,et al. Is Space-Time Attention All You Need for Video Understanding? , 2021, ICML.
[9] Jean-Baptiste Alayrac,et al. Decoupling the Role of Data, Attention, and Losses in Multimodal Transformers , 2021, Transactions of the Association for Computational Linguistics.
[10] Pieter Abbeel,et al. Bottleneck Transformers for Visual Recognition , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[11] S. Gelly,et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale , 2020, ICLR.
[12] Nicolas Usunier,et al. End-to-End Object Detection with Transformers , 2020, ECCV.
[13] Thanos G. Stavropoulos,et al. IoT Wearable Sensors and Devices in Elderly Care: A Literature Review , 2020, Sensors.
[14] Christoph Feichtenhofer,et al. X3D: Expanding Architectures for Efficient Video Recognition , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Noel E. O'Connor,et al. HealthMedia'19: 4th International Workshop on Multimedia for Personal Health and Health Care , 2019, ACM Multimedia.
[16] Jenny Benois-Pineau,et al. Multi-sensing of fragile persons for risk situation detection: devices, methods, challenges , 2019, 2019 International Conference on Content-Based Multimedia Indexing (CBMI).
[17] Omer Levy,et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach , 2019, ArXiv.
[18] Farhaan Mirza,et al. A Systematic Review of Wearable Sensors and IoT-Based Monitoring Applications for Older Adults – a Focus on Ageing Population and Independent Living , 2019, Journal of Medical Systems.
[19] Heng Wang,et al. Video Classification With Channel-Separated Convolutional Networks , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[20] Thobias Sando,et al. GIS-based Spatial and Temporal Analysis of Aging-Involved Accidents: a Case Study of Three Counties in Florida , 2017.
[21] Abhinav Gupta,et al. Non-local Neural Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[22] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[23] Andrew Zisserman,et al. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[24] U. Lindemann,et al. Sit-to-Stand Transition Reveals Acute Fall Risk in Activities of Daily Living , 2016, IEEE Journal of Translational Engineering in Health and Medicine.
[25] Georgios Meditskos,et al. Semantic Event Fusion of Different Visual Modality Concepts for Activity Recognition , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[26] Tao Mei,et al. Action Recognition by Learning Deep Multi-Granular Spatio-Temporal Video Representation , 2016, ICMR.
[27] Lorenzo Torresani,et al. Learning Spatiotemporal Features with 3D Convolutional Networks , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[28] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[29] Yoshua Bengio,et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.
[30] Ming Yang,et al. 3D Convolutional Neural Networks for Human Action Recognition , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[31] H. Amièva,et al. Frailty among community-dwelling elderly people in France: the three-city study. , 2008, The journals of gerontology. Series A, Biological sciences and medical sciences.
[32] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[33] Thinhinane Yebda,et al. Multimodal Sensor Data Analysis for Detection of Risk Situations of Fragile People in @home Environments , 2021, MMM.
[34] Mufti Mahmud,et al. Machine Learning Based Early Fall Detection for Elderly People with Neurological Disorder Using Multimodal Data Fusion , 2020, BI.
[35] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.