Sampled Softmax with Random Fourier Features
[1] James Philbin, et al. FaceNet: A unified embedding for face recognition and clustering, 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[2] Xavier Bouthillier, et al. Efficient Exact Gradient Update for training Deep Networks with Very Large Sparse Targets, 2014, NIPS.
[3] Jeffrey Dean, et al. Distributed Representations of Words and Phrases and their Compositionality, 2013, NIPS.
[4] Koray Kavukcuoglu, et al. Learning word embeddings efficiently with noise-contrastive estimation, 2013, NIPS.
[5] Pascal Vincent, et al. An Exploration of Softmax Alternatives Belonging to the Spherical Loss Family, 2015, ICLR.
[6] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[7] Xinhua Zhang, et al. DS-MLR: Exploiting Double Separability for Scaling up Distributed Multinomial Logistic Regression, 2016, ArXiv.
[8] Pradeep Ravikumar, et al. Loss Decomposition for Fast Learning in Large Output Spaces, 2018, ICML.
[9] Sebastian Fedden, et al. Extreme classification, 2018, Cognitive Linguistics.
[10] Jian Cheng, et al. NormFace: L2 Hypersphere Embedding for Face Verification, 2017, ACM Multimedia.
[11] Garud Iyengar, et al. Unbiased scalable softmax optimization, 2018, ArXiv.
[12] Sashank J. Reddi, et al. Stochastic Negative Mining for Learning with Large Output Spaces, 2018, AISTATS.
[13] Sanjiv Kumar, et al. Orthogonal Random Features, 2016, NIPS.
[14] Benjamin Recht, et al. Random Features for Large-Scale Kernel Machines, 2007, NIPS.
[15] Yoshua Bengio, et al. Hierarchical Probabilistic Neural Network Language Model, 2005, AISTATS.
[16] Stefano Ermon, et al. Fast Amortized Inference and Learning in Log-linear Models with Randomly Perturbed Nearest Neighbor Search, 2017, UAI.
[17] Manik Varma, et al. Multi-label learning with millions of labels: recommending advertiser bid phrases for web pages, 2013, WWW.
[18] Guy Blanc, et al. Adaptive Sampled Softmax with Kernel Based Sampling, 2017, ICML.
[19] Vikas Sindhwani, et al. Quasi-Monte Carlo Feature Maps for Shift-Invariant Kernels, 2014, J. Mach. Learn. Res.
[20] Aapo Hyvärinen, et al. Estimation of Non-Normalized Statistical Models by Score Matching, 2005, J. Mach. Learn. Res.
[21] Harish Karnick, et al. Random Feature Maps for Dot Product Kernels, 2012, AISTATS.
[22] Paul Covington, et al. Deep Neural Networks for YouTube Recommendations, 2016, RecSys.
[23] Thomas Gärtner, et al. Probabilistic Structured Predictors, 2009, UAI.
[24] Dennis DeCoste, et al. Compact Random Feature Maps, 2013, ICML.
[25] Carlos D. Castillo, et al. L2-constrained Softmax Loss for Discriminative Face Verification, 2017, ArXiv.
[26] Yoshua Bengio, et al. On Using Very Large Target Vocabulary for Neural Machine Translation, 2014, ACL.
[27] Jonathon Shlens, et al. Deep Networks With Large Output Spaces, 2014, ICLR.
[28] Zoltán Szabó, et al. Optimal Rates for Random Fourier Features, 2015, NIPS.
[29] Moustapha Cissé, et al. Efficient softmax approximation for GPUs, 2016, ICML.
[30] Yoshua Bengio, et al. Adaptive Importance Sampling to Accelerate Training of a Neural Probabilistic Language Model, 2008, IEEE Transactions on Neural Networks.
[31] Bhiksha Raj, et al. SphereFace: Deep Hypersphere Embedding for Face Recognition, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[32] Manik Varma, et al. Extreme Multi-label Loss Functions for Recommendation, Tagging, Ranking & Other Missing Label Applications, 2016, KDD.
[33] Sanjiv Kumar, et al. Spherical Random Features for Polynomial Kernels, 2015, NIPS.