Paper information - Active EM to reduce noise in activity recognition

Active EM to reduce noise in activity recognition

Intelligent desktop environments allow the desktop user to define a set of projects or activities that characterize the user's desktop work. These environments then attempt to identify the current activity of the user in order to provide various kinds of assistance. These systems take a hybrid approach in which they allow the user to declare their current activity but they also employ learned classifiers to predict the current activity to cover those cases where the user forgets to declare the current activity. The classifiers must be trained on the very noisy data obtained from the user's activity declarations. Instead of asking the user to review and relabel the data manually, we employ an active EM algorithm that combines the EM algorithm and active learning. EM can be viewed as retraining on its own predictions. To make it more robust, we only retrain on those predictions that are made with high confidence. For active learning, we make a small number of queries to the user based on the most uncertain instances. Experimental results on real users show this active EM algorithm can significantly improve the prediction precision, and that it performs better than either EM or active learning alone.
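The loop described in the abstract (retrain on confident predictions, query the user about the most uncertain instances) can be illustrated with a minimal sketch. This is not the authors' implementation. The Bernoulli naive Bayes classifier, the binary features, the names `train_nb`, `posterior`, `active_em`, and `oracle`, and the default values for `threshold`, `rounds`, and `queries_per_round` are all assumptions chosen for illustration.

```python
import math


def train_nb(X, y, classes):
    """Fit a Bernoulli naive Bayes model with Laplace smoothing (assumed classifier)."""
    n_features = len(X[0])
    prior, cond = {}, {}
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        prior[c] = (len(rows) + 1) / (len(X) + len(classes))
        cond[c] = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2)
                   for j in range(n_features)]
    return prior, cond


def posterior(model, x):
    """Return P(class | x) as a dict, normalised in log space for stability."""
    prior, cond = model
    logp = {}
    for c in prior:
        lp = math.log(prior[c])
        for j, v in enumerate(x):
            p = cond[c][j]
            lp += math.log(p if v else 1.0 - p)
        logp[c] = lp
    m = max(logp.values())
    z = sum(math.exp(v - m) for v in logp.values())
    return {c: math.exp(v - m) / z for c, v in logp.items()}


def active_em(X, noisy_y, oracle, classes,
              rounds=3, threshold=0.9, queries_per_round=1):
    """Clean noisy labels by combining confident self-relabeling with user queries.

    oracle(i) is a hypothetical callback that returns the user's answer
    for instance i; labels obtained this way are never overwritten.
    """
    y = list(noisy_y)
    verified = set()
    for _ in range(rounds):
        model = train_nb(X, y, classes)
        posts = [posterior(model, x) for x in X]

        # EM-like step: adopt the model's prediction only when it is confident.
        for i, p in enumerate(posts):
            if i in verified:
                continue
            best = max(p, key=p.get)
            if p[best] >= threshold:
                y[i] = best

        # Active-learning step: ask the user about the least confident instances.
        candidates = sorted((i for i in range(len(X)) if i not in verified),
                            key=lambda i: max(posts[i].values()))
        for i in candidates[:queries_per_round]:
            y[i] = oracle(i)
            verified.add(i)

    return y, train_nb(X, y, classes)
```

In this sketch the confidence threshold keeps low-confidence predictions from overwriting the declared labels, while the query budget spends the user's attention only on the instances where the model is least sure.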

Thomas G. Dietterich | Jianqiang Shen
