Estimating Age on Twitter Using Self-Training Semi-Supervised SVM
Methods for estimating the attributes of Twitter users typically require a large amount of labeled data. An efficient alternative is therefore to label unlabeled data automatically and add it to the training set. We apply a self-training SVM as a semi-supervised method for age estimation and introduce Platt scaling as the criterion for selecting unlabeled data during self-training. We show how the performance of the self-training SVM varies with the amount of training data and with the value of the selection criterion.
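The self-training procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that do not come from the abstract: a binary task (for example, two hypothetical age groups labeled +1 and -1), a linear SVM trained by Pegasos-style subgradient descent, a Platt sigmoid fitted by plain gradient descent on the labeled set itself, and toy 2-D feature vectors. All function names and parameter values (`train_linear_svm`, `fit_platt`, `self_train`, `threshold=0.8`) are hypothetical.

```python
import math
import random

def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Pegasos-style linear SVM (assumed stand-in); y in {-1, +1}; bias is the last weight."""
    rng = random.Random(seed)
    w = [0.0] * (len(X[0]) + 1)
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)
            xi = X[i] + [1.0]
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, xi))
            w = [(1.0 - eta * lam) * wj for wj in w]      # shrink (regularization)
            if margin < 1.0:                               # hinge-loss subgradient step
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, xi)]
    return w

def decision(w, x):
    """Signed SVM score f(x)."""
    return sum(wj * xj for wj, xj in zip(w, x + [1.0]))

def platt_prob(f, A, B):
    """Platt's sigmoid P(y=+1 | f) = 1 / (1 + exp(A*f + B)), computed stably."""
    z = A * f + B
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))

def fit_platt(scores, y, lr=0.05, iters=2000):
    """Fit A, B by gradient descent on the log-loss, using Platt's smoothed targets."""
    n_pos = sum(1 for v in y if v == 1)
    n_neg = len(y) - n_pos
    t_pos, t_neg = (n_pos + 1.0) / (n_pos + 2.0), 1.0 / (n_neg + 2.0)
    targets = [t_pos if v == 1 else t_neg for v in y]
    A, B = 0.0, 0.0
    for _ in range(iters):
        gA = gB = 0.0
        for f, t in zip(scores, targets):
            diff = t - platt_prob(f, A, B)                 # d(loss)/d(A*f + B)
            gA += diff * f
            gB += diff
        A -= lr * gA / len(y)
        B -= lr * gB / len(y)
    return A, B

def self_train(X_lab, y_lab, X_unl, threshold=0.8, max_rounds=10):
    """Repeatedly pseudo-label pool items whose Platt confidence reaches the threshold."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unl)
    for _ in range(max_rounds):
        w = train_linear_svm(X_lab, y_lab)
        A, B = fit_platt([decision(w, x) for x in X_lab], y_lab)
        remaining, added = [], 0
        for x in pool:
            p = platt_prob(decision(w, x), A, B)
            if max(p, 1.0 - p) >= threshold:               # selection criterion
                X_lab.append(x)
                y_lab.append(1 if p >= 0.5 else -1)
                added += 1
            else:
                remaining.append(x)
        pool = remaining
        if added == 0 or not pool:
            break
    return train_linear_svm(X_lab, y_lab), X_lab, y_lab, pool

# Toy data: four labeled points and five unlabeled points (illustrative only).
X_seed = [[2.0, 2.0], [1.5, 2.5], [-2.0, -2.0], [-1.5, -2.5]]
y_seed = [1, 1, -1, -1]
X_pool = [[3.0, 3.0], [-3.0, -3.0], [2.5, 1.5], [-2.5, -1.5], [0.2, -0.1]]
w, X_all, y_all, pool = self_train(X_seed, y_seed, X_pool, threshold=0.8)
print(len(X_all), "labeled after self-training;", len(pool), "left in the pool")
```

Raising `threshold` admits fewer but more reliable pseudo-labels, while lowering it grows the training set faster at the risk of reinforcing early mistakes. This is the trade-off behind varying the selection criterion value. Because Platt's smoothed targets cap the attainable confidence when the labeled set is very small, a workable threshold depends on how much labeled data is available.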