Prediction of Mathematical Expression Constraints (ME-Con)

This paper presents two prediction models for Mathematical Expression Constraints (ME-Con) in technical publications. Under the assumption of independent probability distributions, two types of features are used for analysis: FS, based on the ME symbols, and FW, based on the words adjacent to MEs. The first model is based on an iterative greedy scheme that aims to optimize the performance goal. The second model is based on naïve Bayesian inference over the two feature types, considering the likelihood of the training data. The first model achieved an average F1 score of 69.5% in tests on an Elsevier dataset. The second model, using FS, achieved an F1 score of 82.4% and an accuracy of 81.8%. Using the word stems of FW, it achieved F1 scores similar to, yet slightly higher than, those of the first model; using the Part-Of-Speech (POS) tags of FW, its F1 score was slightly lower.
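To make the second model concrete, the following is a minimal sketch of naïve Bayesian classification over symbol features, together with the F1 measure used for evaluation. It is not the authors' implementation: the label names ("constraint", "other"), the Laplace smoothing parameter `alpha`, the function names, and the toy training data are all assumptions introduced for illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(samples, alpha=1.0):
    """Count labels and per-label feature occurrences.

    samples: list of (features, label) pairs, where features is a list of
    tokens (for example, the symbols of one mathematical expression).
    alpha: Laplace smoothing constant (an assumption, not from the paper).
    """
    label_counts = Counter()
    feature_counts = defaultdict(Counter)
    vocab = set()
    for features, label in samples:
        label_counts[label] += 1
        for f in features:
            feature_counts[label][f] += 1
            vocab.add(f)
    return {"labels": label_counts, "features": feature_counts,
            "vocab": vocab, "alpha": alpha}


def predict(model, features):
    """Return the label with the highest log posterior.

    Features are treated as independent given the label, so their
    log likelihoods are simply summed.
    """
    total = sum(model["labels"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, count in model["labels"].items():
        score = math.log(count / total)  # log prior
        counts = model["features"][label]
        denom = sum(counts.values()) + model["alpha"] * vocab_size
        for f in features:
            if f not in model["vocab"]:
                continue  # ignore features never seen in training
            score += math.log((counts[f] + model["alpha"]) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def f1_score(gold, predicted, positive):
    """F1 of the positive class: harmonic mean of precision and recall."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: symbol lists labeled by hand for illustration only.
TRAINING = [
    (["x", ">", "0"], "constraint"),
    (["n", "<=", "N"], "constraint"),
    (["y", "=", "a", "+", "b"], "other"),
    (["E", "=", "m", "c"], "other"),
]
MODEL = train_naive_bayes(TRAINING)
```

With this toy model, `predict(MODEL, [">", "0"])` selects "constraint" and `predict(MODEL, ["=", "m"])` selects "other". The same functions would apply to FW features by passing word stems or POS tags as the tokens instead of symbols.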