Classification vs. Regression in Supervised Learning for Single Channel Speaker Count Estimation
Emanuël A. P. Habets | Soumitro Chakrabarty | Bernd Edler | Fabian-Robert Stöter
[1] Srinivas S. Kruthiventi,et al. CrowdNet: A Deep Convolutional Network for Dense Crowd Counting, 2016, ACM Multimedia.
[2] Xiaochun Cao,et al. Deep People Counting in Extremely Dense Crowds , 2015, ACM Multimedia.
[3] Reinhold Häb-Umbach,et al. Source counting in speech mixtures by nonparametric Bayesian estimation of an infinite Gaussian mixture model , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[4] Siham Ouamour,et al. Proposal of a New Confidence Parameter Estimating the Number of Speakers -An experimental investigation-, 2010, J. Inf. Hiding Multim. Signal Process.
[5] Sepp Hochreiter,et al. The Vanishing Gradient Problem During Learning Recurrent Neural Nets and Problem Solutions, 1998, Int. J. Uncertain. Fuzziness Knowl. Based Syst.
[6] Valentin Andrei,et al. Counting competing speakers in a timeframe - human versus computer , 2015, INTERSPEECH.
[7] Björn W. Schuller,et al. Enhancing LSTM RNN-Based Speech Overlap Detection by Artificially Mixed Data , 2017, Semantic Audio.
[8] Noel E. O'Connor,et al. Fully Convolutional Crowd Counting on Highly Congested Scenes, 2016, VISIGRAPP.
[9] Ian D. Reid,et al. DeepSetNet: Predicting Sets with Deep Neural Networks , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).
[10] Franck Giron,et al. Deep neural network based instrument extraction from music , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Ramprasaath R. Selvaraju,et al. Counting Everyday Objects in Everyday Scenes , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Douglas A. Reynolds,et al. An overview of automatic speaker diarization systems , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[14] Mathieu Salzmann,et al. Deep Convolutional Neural Networks for Human Embryonic Cell Counting , 2016, ECCV Workshops.
[15] Jesper Jensen,et al. Permutation invariant training of deep models for speaker-independent multi-talker speech separation , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[16] Aren Jansen,et al. CNN architectures for large-scale audio classification , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Takayuki Kawashima,et al. Perceptual limits in a simulated “Cocktail party” , 2015, Attention, Perception, & Psychophysics.
[18] Tuomas Virtanen,et al. TUT database for acoustic scene classification and sound event detection , 2016, 2016 24th European Signal Processing Conference (EUSIPCO).
[19] Xiaogang Wang,et al. Cross-scene crowd counting via deep convolutional neural networks , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[20] K. P. Choi. On the medians of gamma distributions and an equation of Ramanujan, 1994.
[21] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[22] Takayuki Arai,et al. Estimating number of speakers by the modulation characteristics of speech, 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).
[23] Roland Badeau,et al. Singing voice detection with deep recurrent neural networks , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[24] Geoffrey E. Hinton,et al. Deep Learning , 2015, Nature.
[25] Hong Gu,et al. Nonlinear Poisson regression using neural networks: a simulation study , 2009, Neural Computing and Applications.
[26] Jordi Vitrià,et al. Learning to count with deep object features , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[27] Jan Schlüter,et al. Learning to Pinpoint Singing Voice from Weakly Labeled Examples , 2016, ISMIR.
[28] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[29] Jun Li,et al. Crowd++: unsupervised speaker count with smartphones , 2013, UbiComp.
[30] Sanjeev Khudanpur,et al. Librispeech: An ASR corpus based on public domain audio books , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[31] Geoffrey E. Hinton,et al. Speech recognition with deep recurrent neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[32] Hugo Van hamme,et al. Blind audio source counting and separation of anechoic mixtures using the multichannel complex NMF framework, 2015, Signal Process.
[33] Zhuo Chen,et al. Deep clustering: Discriminative embeddings for segmentation and separation , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[34] Margrit Betke,et al. Salient Object Subitizing , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[35] Nuno Vasconcelos,et al. Bayesian Poisson regression for crowd counting , 2009, 2009 IEEE 12th International Conference on Computer Vision.
[36] Andrew Zisserman,et al. Counting in the Wild , 2016, ECCV.