Clustering Main Concepts from e-Mails

E–mail is one of the most common ways to communicate, assuming, in some cases, up to 75% of a company’s communication, in which every employee spends about 90 minutes a day in e–mail tasks such as filing and deleting. This paper deals with the generation of clusters of relevant words from E–mail texts. Our approach consists of the application of text mining techniques and, later, data mining techniques, to obtain related concepts extracted from sent and received messages. We have developed a new clustering algorithm based on neighborhood, which takes into account similarity values among words obtained in the text mining phase. The potential of these applications is enormous and only a few companies, mainly large organizations, have invested in this project so far, taking advantage of employees’s knowledge in future decisions.