Exploiting the Complementarity of Audio and Visual Data in Multi-speaker Tracking
Radu Horaud | Laurent Girin | Xavier Alameda-Pineda | Yutong Ban
[1] H.K. Ekenel,et al. Kalman filters for audio-video source localization , 2005, IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2005.
[2] Radford M. Neal. Pattern Recognition and Machine Learning , 2007, Technometrics.
[3] Jean-Marc Odobez,et al. Audiovisual Probabilistic Tracking of Multiple Speakers in Meetings , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[4] V. Šmídl,et al. The Variational Bayes Method in Signal Processing , 2005.
[5] Radu Horaud,et al. High-dimensional regression with gaussian mixtures and partially-latent response variables , 2013, Statistics and Computing.
[6] Radu Horaud,et al. Tracking Multiple Persons Based on a Variational Bayesian Model , 2016, ECCV Workshops.
[7] Radu Horaud,et al. A Variational EM Algorithm for the Separation of Time-Varying Convolutive Audio Mixtures , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[8] Emmanuel Vincent,et al. A Consolidated Perspective on Multimicrophone Speech Enhancement and Source Separation , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[9] Josef Kittler,et al. Mean-Shift and Sparse Sampling-Based SMC-PHD Filtering for Audio Informed Visual Speaker Tracking , 2016, IEEE Transactions on Multimedia.
[10] Radu Horaud,et al. Audio-Visual Speaker Diarization Based on Spatiotemporal Bayesian Fusion , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[11] Josef Kittler,et al. Audio Assisted Robust Visual Tracking With Adaptive Particle Filtering , 2015, IEEE Transactions on Multimedia.
[12] Radu Horaud,et al. Co-Localization of Audio Sources in Images Using Binaural Features and Locally-Linear Regression , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[13] Radu Horaud,et al. An EM algorithm for joint source separation and diarisation of multichannel convolutive speech mixtures , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Javier Ramírez,et al. Efficient voice activity detection algorithms using long-term speech information , 2004, Speech Commun.
[15] Muhammad Salman Khan,et al. Multimodal (audio-visual) source separation exploiting multi-speaker tracking, robust beamforming and time-frequency masking , 2012, IET Signal Process.
[16] Radu Horaud,et al. Vision-guided robot hearing , 2013, Int. J. Robotics Res.
[17] Yaser Sheikh,et al. OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[18] Yang Liu,et al. Non-Zero Diffusion Particle Flow SMC-PHD Filter for Audio-Visual Multi-Speaker Tracking , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] James R. Hopgood,et al. Person tracking via audio and video fusion , 2012.
[20] Radu Horaud,et al. EM Algorithms for Weighted-Data Clustering with Application to Audio-Visual Scene Analysis , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[21] C. Zetzsche,et al. Information-Driven Active Audio-Visual Source Localization , 2015, PloS one.
[22] Andrea Cavallaro,et al. 3D audio-visual speaker tracking with an adaptive particle filter , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).