Application of SVM in Citation Information Extraction

Support Vector Machines (SVMs) are effective binary classifiers. In information extraction, Hidden Markov Models (HMMs) make only limited use of the structural features of text. To exploit these features more fully, this paper proposes a multi-class SVM classification method based on Markov properties for extracting information from a citation database. The proposed model uses symbol characteristics as features and composes the transition probabilities into a binary tree. Experiments show that the proposed method outperforms both the HMM and basic SVM methods.
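The abstract does not specify the feature set or how the binary tree is constructed, so the following is only a minimal sketch of the two ingredients it names: symbol-level features of a citation token, and label transition probabilities estimated from tagged citations. The feature names, the field labels (`author`, `year`, `title`, `journal`), and the sample sequences are all hypothetical and chosen for illustration.

```python
from collections import Counter, defaultdict


def symbol_features(token):
    """Map a citation token to simple symbol-level indicators.

    The feature set here is an assumption, not the one used in the paper.
    """
    return {
        "has_digit": any(c.isdigit() for c in token),
        "all_digits": token.isdigit(),
        "is_capitalized": token[:1].isupper(),
        "ends_with_period": token.endswith("."),
        "ends_with_comma": token.endswith(","),
        "has_paren": "(" in token or ")" in token,
    }


def transition_probabilities(label_sequences):
    """Estimate P(next_label | current_label) by counting adjacent label pairs."""
    counts = defaultdict(Counter)
    for labels in label_sequences:
        for cur, nxt in zip(labels, labels[1:]):
            counts[cur][nxt] += 1
    return {
        cur: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for cur, c in counts.items()
    }


# Hypothetical field labels for two tokenized citations.
sequences = [
    ["author", "author", "year", "title", "title", "journal"],
    ["author", "year", "title", "journal", "journal"],
]
probs = transition_probabilities(sequences)
features = symbol_features("1998.")
```

In a setup like this, the symbol features would be the input to each binary SVM, while the transition probabilities could guide which field labels are separated first in the tree. That ordering rule is an assumption made here, since the abstract does not describe it.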