Using Transduction and Multi-view Learning to Answer Emails

Many organizations and companies must answer large volumes of email. Often, most of these emails contain variations of relatively few frequently asked questions. We address the problem of predicting which of several frequently used answers a user will choose when responding to an email. Our approach uses the data that is typically available in this setting: inbound and outbound emails stored on a server. We take into account that the server holds no explicit links between inbound emails and the corresponding outbound emails. We map the problem to a semi-supervised classification problem that can be addressed by algorithms such as the transductive support vector machine and multi-view learning. We evaluate our approach using emails sent to a corporate customer service department.
