Paper Information - Estimating Age on Twitter Using Self-Training Semi-Supervised SVM

Estimating Age on Twitter Using Self-Training Semi-Supervised SVM

Methods for estimating Twitter users' attributes typically require a large amount of labeled data. An efficient way to obtain more is to label unlabeled data automatically and add it to the training set. We applied a self-training SVM as a semi-supervised method for age estimation and introduced Platt scaling as the selection criterion for unlabeled data in the self-training process. We show how the performance of the self-training SVM varies as the amount of training data and the value of the selection criterion change.
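The self-training loop described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. A small pure-Python linear SVM (batch subgradient descent on the hinge loss) stands in for the paper's SVM. The Platt sigmoid P(y=+1 | f) = 1 / (1 + exp(A f + B)) is fitted by plain gradient descent on the training scores, rather than with Platt's original optimizer on held-out scores. All function names, the toy 2-D data, and the threshold value 0.8 are hypothetical.

```python
import math

def decision(model, x):
    """Signed SVM score f(x) = w.x + b."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Linear SVM via batch subgradient descent on the L2-regularized
    hinge loss. Labels must be -1 or +1. Returns (weights, bias)."""
    n, dim = len(X), len(X[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            if yi * decision((w, b), xi) < 1:  # inside the margin or misclassified
                for j in range(dim):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def platt_probability(score, A, B):
    """Platt sigmoid: estimated P(y = +1 | score)."""
    return 1.0 / (1.0 + math.exp(A * score + B))

def fit_platt(scores, labels, epochs=500, lr=0.1):
    """Fit A and B by gradient descent on the cross-entropy, using
    Platt's smoothed targets instead of hard 0/1 targets."""
    n_pos = sum(1 for yi in labels if yi == 1)
    n_neg = len(labels) - n_pos
    t_pos = (n_pos + 1.0) / (n_pos + 2.0)
    t_neg = 1.0 / (n_neg + 2.0)
    A, B, n = 0.0, 0.0, len(scores)
    for _ in range(epochs):
        grad_A = grad_B = 0.0
        for f, yi in zip(scores, labels):
            t = t_pos if yi == 1 else t_neg
            p = platt_probability(f, A, B)
            grad_A += (t - p) * f / n
            grad_B += (t - p) / n
        A -= lr * grad_A
        B -= lr * grad_B
    return A, B

def self_train(X_lab, y_lab, X_unl, threshold=0.8, max_rounds=10):
    """Repeatedly train, pseudo-label the confident unlabeled points,
    and move them into the labeled set."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unl)
    for _ in range(max_rounds):
        model = train_linear_svm(X_lab, y_lab)
        A, B = fit_platt([decision(model, x) for x in X_lab], y_lab)
        remaining, added = [], 0
        for x in pool:
            p = platt_probability(decision(model, x), A, B)
            if max(p, 1.0 - p) >= threshold:  # selection criterion
                X_lab.append(x)
                y_lab.append(1 if p >= 0.5 else -1)
                added += 1
            else:
                remaining.append(x)
        pool = remaining
        if added == 0 or not pool:
            break
    # Retrain once more so the returned model has seen every pseudo-label.
    return train_linear_svm(X_lab, y_lab), X_lab, y_lab, pool

# Toy data (hypothetical): two well-separated 2-D classes.
X_lab = [(-2.0, -2.0), (-1.5, -1.0), (-1.0, -2.0),
         (2.0, 2.0), (1.0, 1.5), (2.0, 1.0)]
y_lab = [-1, -1, -1, 1, 1, 1]
X_unl = [(-3.0, -3.0), (3.0, 3.0), (0.05, -0.05), (-2.5, -1.5), (2.5, 2.0)]

model, X_out, y_out, pool_out = self_train(X_lab, y_lab, X_unl, threshold=0.8)
```

Raising `threshold` admits fewer but more reliable pseudo-labels per round, and lowering it grows the training set faster at the risk of reinforcing early mistakes. This is the trade-off behind varying the selection criterion value.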

Satoshi Endo | Naruaki Toma | Koji Yamada | Yuhei Akamine | Tatsuyuki Iju

[1] John Platt, et al. Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods, 1999.

[2] H. J. Scudder, et al. Probability of error of some adaptive pattern-recognition machines, 1965, IEEE Trans. Inf. Theory.

[3] Ana-Maria Popescu, et al. Democrats, republicans and starbucks afficionados: user classification in twitter, 2011, KDD.

[4] John D. Burger, et al. Discriminating Gender on Twitter, 2011, EMNLP.