[1] Wenguang Chen,et al. SaberLDA: Sparsity-Aware Learning of Topic Models on GPUs, 2017, ASPLOS.
[2] Andrew McCallum,et al. Efficient methods for topic model inference on streaming document collections , 2009, KDD.
[3] Yuxiong He,et al. GRNN: Low-Latency and Scalable RNN Inference on GPUs , 2019, EuroSys.
[4] Yelong Shen,et al. End-to-end Learning of LDA by Mirror-Descent Back Propagation over a Deep Architecture , 2015, NIPS.
[5] Xiaojin Zhu,et al. A Topic Model for Word Sense Disambiguation , 2007, EMNLP.
[6] Jianxun Liu,et al. Functional and Contextual Attention-Based LSTM for Service Recommendation in Mashup Creation , 2019, IEEE Transactions on Parallel and Distributed Systems.
[7] Ulrich Meyer,et al. Delta-Stepping: A Parallel Single Source Shortest Path Algorithm , 1998, ESA.
[8] Yee Whye Teh,et al. A Collapsed Variational Bayesian Inference Algorithm for Latent Dirichlet Allocation , 2006, NIPS.
[9] Rio Yokota,et al. Exhaustive Study of Hierarchical AllReduce Patterns for Large Messages Between GPUs , 2019, 2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID).
[10] John Canny,et al. SAME but Different: Fast and High Quality Gibbs Parameter Estimation , 2014, KDD.
[11] Alexander J. Smola,et al. Exponential Stochastic Cellular Automata for Massively Parallel Inference , 2016, AISTATS.
[12] Marc Snir,et al. Aluminum: An Asynchronous, GPU-Aware Communication Library Optimized for Large-Scale Training of Deep Neural Networks on HPC Systems , 2018, 2018 IEEE/ACM Machine Learning in HPC Environments (MLHPC).
[13] Takuya Akiba,et al. ChainerMN: Scalable Distributed Deep Learning Framework , 2017, ArXiv.
[14] Yao Zhang,et al. Scan primitives for GPU computing , 2007, GH '07.
[15] Yibo Wang,et al. Leveraging deep learning with LDA-based text analytics to detect automobile insurance fraud , 2018, Decis. Support Syst.
[16] G. C. Wei,et al. A Monte Carlo Implementation of the EM Algorithm and the Poor Man's Data Augmentation Algorithms, 1990.
[17] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[18] L. Tierney. Markov Chains for Exploring Posterior Distributions, 1994.
[19] Fan Yao,et al. XBFS: eXploring Runtime Optimizations for Breadth-First Search on GPUs , 2019, HPDC.
[20] Inderjit S. Dhillon,et al. A Scalable Asynchronous Distributed Algorithm for Topic Modeling , 2014, WWW.
[21] Fei-Fei Li,et al. Spatially Coherent Latent Topic Model for Concurrent Segmentation and Classification of Objects and Scenes , 2007, 2007 IEEE 11th International Conference on Computer Vision.
[22] Ralf Krestel,et al. Latent dirichlet allocation for tag recommendation , 2009, RecSys '09.
[23] Bo Wu,et al. Graphie: Large-Scale Asynchronous Graph Traversals on Just a GPU , 2017, 2017 26th International Conference on Parallel Architectures and Compilation Techniques (PACT).
[24] Lixin Gao,et al. Maiter: An Asynchronous Graph Processing Framework for Delta-based Accumulative Iterative Computation , 2017, arXiv:1710.05785.
[25] Feng Yan,et al. Parallel Inference for Latent Dirichlet Allocation on Graphics Processing Units , 2009, NIPS.
[26] Wenguang Chen,et al. WarpLDA: a Cache Efficient O(1) Algorithm for Latent Dirichlet Allocation , 2015, Proc. VLDB Endow.
[27] Julian Kates-Harbeck,et al. Training distributed deep recurrent neural networks with mixed precision on GPU clusters , 2017, MLHPC@SC.
[28] Mohamed Wahib,et al. Scalable Kernel Fusion for Memory-Bound GPU Applications , 2014, SC14: International Conference for High Performance Computing, Networking, Storage and Analysis.
[29] Michael I. Jordan,et al. Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res.
[30] Wenguang Chen,et al. Zwift: A Programming Framework for High Performance Text Analytics on Compressed Data , 2018, ICS.
[31] John D. Owens,et al. GPU Computing , 2008, Proceedings of the IEEE.
[32] David M. Blei,et al. Relational Topic Models for Document Networks , 2009, AISTATS.
[33] Jonathan Weese,et al. UMBC_EBIQUITY-CORE: Semantic Textual Similarity Systems , 2013, *SEMEVAL.
[34] William H. Press,et al. Numerical Recipes 3rd Edition: The Art of Scientific Computing, 2007.
[35] Erich Elsen,et al. Deep Speech: Scaling up end-to-end speech recognition , 2014, ArXiv.
[36] Bin Cui,et al. LDA*: A Robust and Large-scale Topic Modeling System , 2017, Proc. VLDB Endow.
[37] Chi Zhang,et al. Locality-Aware Software Throttling for Sparse Matrix Operation on GPUs , 2018, USENIX Annual Technical Conference.
[38] Alexander J. Smola,et al. Reducing the sampling complexity of topic models , 2014, KDD.
[39] Kurt Keutzer,et al. Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT , 2020, AAAI.
[40] Andreas Gerstlauer,et al. Start Late or Finish Early: A Distributed Graph Processing System with Redundancy Reduction , 2018, Proc. VLDB Endow.
[41] John D. Owens,et al. Gunrock: a high-performance graph processing library on the GPU , 2015, PPoPP.
[42] Edward Y. Chang,et al. Collaborative filtering for orkut communities: discovery of user latent behavior , 2009, WWW '09.
[43] Baobao Chang,et al. Syntax Aware LSTM Model for Chinese Semantic Role Labeling , 2017, ArXiv.
[44] John D. Owens,et al. A Dynamic Hash Table for the GPU , 2017, 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS).
[45] Max Welling,et al. Fast collapsed gibbs sampling for latent dirichlet allocation , 2008, KDD.
[46] Yun Liang,et al. CuLDA_CGS: solving large-scale LDA problems on GPUs , 2018, PPoPP.
[47] Luke S. Zettlemoyer,et al. Deep Contextualized Word Representations , 2018, NAACL.
[48] Jun Zhu,et al. Distributing the Stochastic Gradient Sampler for Large-Scale LDA , 2016, KDD.
[49] Tie-Yan Liu,et al. LightLDA: Big Topic Models on Modest Computer Clusters , 2014, WWW.
[50] Wenguang Chen,et al. Efficient Document Analytics on Compressed Data: Method, Challenges, Algorithms, Insights , 2018, Proc. VLDB Endow.