POS Tagging and its Applications for Mathematics
Content analysis of scientific publications is a nontrivial task, but a useful and important one for scientific information services. In the Gutenberg era it was the domain of human experts; in the digital age many machine-based methods, e.g., graph analysis tools and machine-learning techniques, have been developed for it. Natural Language Processing (NLP) is a powerful machine-learning approach to semiautomatic speech and language processing, and it is also applicable to mathematics. The well-established methods of NLP have to be adjusted to the special needs of mathematics, in particular to the handling of mathematical formulae. We demonstrate a mathematics-aware part-of-speech (POS) tagger and give a short overview of our adaptation of NLP methods for mathematical publications. We show how the tools developed are used for key phrase extraction and classification in the database zbMATH.
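To illustrate what adjusting a tagger for mathematical formulae might involve, the following is a minimal sketch and not the tagger described here. It assumes formulae appear as inline LaTeX between dollar signs, and it keeps each formula as a single token with a hypothetical `FORMULA` tag. The toy lexicon, the `PUNCT` tag, and the `tag` function are all illustrative assumptions; a real tagger would use a trained model and a full tag set.

```python
import re

# Assumption: formulae are inline LaTeX delimited by dollar signs.
FORMULA = re.compile(r"\$[^$]+\$")
WORD = re.compile(r"\w+")
# Tokenizer: a whole formula, a word, or a single punctuation character.
TOKEN = re.compile(r"\$[^$]+\$|\w+|[^\w\s]")

# Toy lexicon, purely illustrative; unknown words default to "NN".
LEXICON = {
    "let": "VB", "be": "VB", "is": "VBZ",
    "a": "DT", "the": "DT", "of": "IN",
    "compact": "JJ", "group": "NN",
}


def tag(sentence):
    """Return (token, tag) pairs, treating each formula as one token."""
    tagged = []
    for tok in TOKEN.findall(sentence):
        if FORMULA.fullmatch(tok):
            # Hypothetical tag: the formula is not split into symbols.
            tagged.append((tok, "FORMULA"))
        elif WORD.fullmatch(tok):
            tagged.append((tok, LEXICON.get(tok.lower(), "NN")))
        else:
            tagged.append((tok, "PUNCT"))
    return tagged


if __name__ == "__main__":
    for token, pos in tag("Let $G$ be a compact group."):
        print(token, pos)
```

Keeping the formula intact means that a downstream step such as key phrase extraction can treat a phrase containing a formula as one unit, instead of seeing stray symbols and operators.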