A Note on an Approximate Learning Algorithm with Limited Parameters for Language Models

Summary Much research in natural language processing (NLP) for artificial intelligence (AI) aims to learn language models. In principle, the goal is to minimize the divergence between the approximate model and the true model, yet most learning algorithms are based on the maximum likelihood method. A high likelihood on a finite sample, however, does not imply that the divergence between the approximate model and the true model is small. This paper proposes a new learning algorithm whose objective is the divergence itself. The proposed algorithm is compared with previous algorithms in simulations.
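
The following is a minimal sketch, not the paper's algorithm, illustrating the point that a high likelihood on a finite sample need not mean a small divergence from the true model. The unigram setup, vocabulary size, and sample size are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup (not from the paper): a "true" unigram language model
# over a small vocabulary, and a maximum-likelihood estimate fitted on a
# finite sample drawn from it.
vocab_size = 20
p_true = rng.dirichlet(np.ones(vocab_size))          # true distribution
sample = rng.choice(vocab_size, size=50, p=p_true)   # finite sample

# Maximum-likelihood estimate: the empirical frequencies of the sample.
counts = np.bincount(sample, minlength=vocab_size)
p_mle = counts / counts.sum()

# The sample's log-likelihood under the MLE model is maximal by construction...
loglik = np.sum(np.log(p_mle[sample]))

# ...but the KL divergence from the true model can still be large
# (indeed infinite whenever a word with nonzero true probability was
# never observed, so p_mle assigns it probability zero; clipped here).
eps = 1e-12
kl = np.sum(p_true * np.log(p_true / np.maximum(p_mle, eps)))

print(f"sample log-likelihood under MLE: {loglik:.2f}")
print(f"KL(p_true || p_mle): {kl:.4f}")
```

Rerunning with a larger sample shrinks the divergence, which is consistent with the summary's observation that the gap between the likelihood criterion and the divergence criterion is a finite-sample effect.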