Extracting User Profiles from E-mails Using the Set-Oriented Classifier

More and more people rely on e-mail rather than postal letters to communicate with each other. Although e-mail is more convenient, letters still have many positive features, and the ability to handle an "anonymous recipient" is one of them. This paper proposes a software agent that routes anonymous-recipient e-mails as a human being would. The agent, named "TWIMC (To Whom It May Concern)", receives anonymous-recipient e-mails, analyzes them, and routes each e-mail to the most qualified person (i.e., e-mail account) inside the organization. The agent employs the Set-oriented Classifier System (SCS), a genetic algorithm classifier that uses a set representation internally. A comparison of SCS with the Support Vector Machine (SVM) shows that SCS outperforms SVM in noisy environments.
