Paul Pu Liang | Manzil Zaheer | Yuan Wang | Amr Ahmed