Backoff inspired features for maximum entropy language models
[1] Jun Wu,et al. Building a topic-dependent maximum entropy model for very large corpora , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[2] Stephen J. Wright,et al. Hogwild: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent , 2011, NIPS.
[3] Ronald Rosenfeld,et al. A maximum entropy approach to adaptive statistical language modelling , 1996, Comput. Speech Lang.
[4] Sanjeev Khudanpur,et al. Efficient Subsampling for Training Complex Language Models , 2011, EMNLP.
[5] Ronald Rosenfeld,et al. Trigger-based language models: a maximum entropy approach , 1993, 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[6] Yoshua Bengio,et al. Hierarchical Probabilistic Neural Network Language Model , 2005, AISTATS.
[7] Sanjay Ghemawat,et al. MapReduce: Simplified Data Processing on Large Clusters , 2004, OSDI.
[8] Cyril Allauzen,et al. Bayesian Language Model Interpolation for Mobile Speech Input , 2011, INTERSPEECH.
[9] Sophia Ananiadou,et al. Stochastic Gradient Descent Training for L1-regularized Log-linear Models with Cumulative Penalty , 2009, ACL.
[10] Jun Wu,et al. Efficient training methods for maximum entropy language modeling , 2000, INTERSPEECH.
[11] Slava M. Katz,et al. Estimation of probabilities from sparse data for the language model component of a speech recognizer , 1987, IEEE Trans. Acoust. Speech Signal Process.
[12] Ruhi Sarikaya,et al. Joint Morphological-Lexical Language Modeling for Processing Morphologically Rich Languages With Application to Dialectal Arabic , 2008, IEEE Transactions on Audio, Speech, and Language Processing.
[13] Francoise Beaufays,et al. “Your Word is my Command”: Google Search by Voice: A Case Study , 2010.
[14] Alexander J. Smola,et al. Parallelized Stochastic Gradient Descent , 2010, NIPS.
[15] Mikko Kurimo,et al. Efficient estimation of maximum entropy language models with n-gram features: an SRILM extension , 2010, INTERSPEECH.
[16] Brian Roark,et al. Discriminative n-gram language modeling , 2007, Comput. Speech Lang.
[17] Thorsten Brants,et al. Large Language Models in Machine Translation , 2007, EMNLP.
[18] Gideon S. Mann,et al. MapReduce/Bigtable for Distributed Optimization , 2010.
[19] Hermann Ney,et al. Feature-rich sub-lexical language models using a maximum entropy approach for German LVCSR , 2013, INTERSPEECH.
[20] Roni Rosenfeld,et al. A whole sentence maximum entropy language model , 1997, 1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings.
[21] Ronald Rosenfeld,et al. A survey of smoothing techniques for ME models , 2000, IEEE Trans. Speech Audio Process.
[22] John N. Tsitsiklis,et al. Distributed Asynchronous Deterministic and Stochastic Gradient Optimization Algorithms , 1984, 1984 American Control Conference.
[23] Ronald Rosenfeld,et al. Whole-sentence exponential language models: a vehicle for linguistic-statistical integration , 2001, Comput. Speech Lang.
[24] Gideon S. Mann,et al. Distributed Training Strategies for the Structured Perceptron , 2010, NAACL.