Closed-form maximum likelihood estimator of conditional random fields

Training Conditional Random Fields (CRFs) can be very slow on big data. In this paper, we present a new training method for CRFs, called {\em Empirical Training}, which is motivated by the concept of the co-occurrence rate. We show that standard (unregularized) training can have many maximum likelihood estimates (MLEs). Empirical training has a unique closed-form MLE, which is also an MLE of standard training. We are the first to identify the \emph{Test Time Problem} of standard training, which may lead to low accuracy. Empirical training is immune to this problem. Empirical training is also unaffected by the label bias problem, even though it is locally normalized. All of these claims have been verified by experiments. Experiments also show that empirical training reduces training time from weeks to seconds and obtains results competitive with standard and piecewise training on linear-chain CRFs, especially when data are insufficient.
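To make the idea of a closed-form, count-based estimator concrete, the sketch below estimates the parameters of a locally normalized linear-chain sequence model directly from empirical conditional frequencies, with no iterative optimization. This is a minimal illustration under simplifying assumptions (simple transition and emission factors, no feature templates), not the paper's exact empirical-training procedure; the function name and data layout are hypothetical.

```python
from collections import Counter

def empirical_estimates(sequences):
    """Closed-form, count-based parameter estimates for a locally
    normalized linear-chain model.  Transition probabilities
    P(y_t | y_{t-1}) and emission probabilities P(x_t | y_t) are
    just empirical conditional frequencies -- no gradient descent
    or other iterative optimization is needed.

    sequences: list of (xs, ys) pairs, where xs is a list of
    observations and ys the corresponding list of labels.
    """
    trans = Counter()      # counts of (y_prev, y) label bigrams
    emit = Counter()       # counts of (y, x) label-observation pairs
    prev_total = Counter() # counts of y_prev as a transition source
    label_total = Counter()# counts of each label y
    for xs, ys in sequences:
        for t, (x, y) in enumerate(zip(xs, ys)):
            emit[(y, x)] += 1
            label_total[y] += 1
            if t > 0:
                trans[(ys[t - 1], y)] += 1
                prev_total[ys[t - 1]] += 1
    # Normalize counts into conditional probabilities.
    p_trans = {k: c / prev_total[k[0]] for k, c in trans.items()}
    p_emit = {k: c / label_total[k[0]] for k, c in emit.items()}
    return p_trans, p_emit

# Tiny usage example with two labeled sequences.
data = [(["a", "b"], ["N", "V"]), (["a", "a"], ["N", "N"])]
p_trans, p_emit = empirical_estimates(data)
```

Because the estimates are plain ratios of counts, a single pass over the training data suffices, which is the source of the weeks-to-seconds speedup the abstract reports for empirical training.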