Learning to Complete Sentences

We consider the problem of predicting how a user will continue a given initial text fragment. Intuitively, our goal is to develop a “tab-complete” function for natural language, based on a model that is learned from text data. We consider two learning mechanisms that generate predictive models from application-specific document collections: we develop an N-gram-based completion method and discuss the application of instance-based learning. After developing evaluation metrics for this task, we empirically compare the model-based method to the instance-based method and assess the predictability of call-center emails, personal emails, and weather reports.
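
To make the idea of an N-gram-based completion concrete, the following Python sketch trains a bigram model (N = 2) on a small collection of sentences and greedily extends a fragment with the most frequent next word. It is only an illustration under simplifying assumptions, not the method evaluated here: the function names `train_bigrams` and `complete`, the end-of-sentence marker `</s>`, whitespace tokenization, and greedy decoding are all hypothetical choices made for this example.

```python
from collections import Counter, defaultdict

END = "</s>"  # hypothetical end-of-sentence marker used by this sketch


def train_bigrams(sentences):
    """Count, for each word, how often each other word follows it."""
    follow = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split() + [END]
        for prev, nxt in zip(tokens, tokens[1:]):
            follow[prev][nxt] += 1
    return follow


def complete(follow, fragment, max_words=10):
    """Greedily extend the fragment with the most frequent next word.

    Stops at the end-of-sentence marker, at an unseen word, or after
    max_words words, whichever comes first.
    """
    tokens = fragment.lower().split()
    if not tokens:
        return []
    completion = []
    prev = tokens[-1]
    for _ in range(max_words):
        if prev not in follow:
            break  # the last word never occurred in training
        nxt, _count = follow[prev].most_common(1)[0]
        if nxt == END:
            break
        completion.append(nxt)
        prev = nxt
    return completion


if __name__ == "__main__":
    corpus = [
        "please find the invoice attached",
        "please find the invoice attached",
        "please find the report below",
    ]
    model = train_bigrams(corpus)
    print(" ".join(complete(model, "Please find")))
```

A practical completion method would likely also decide when a prediction is confident enough to show to the user at all; this sketch always proposes the greedy continuation.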