Email mining: tasks, common techniques, and tools

[1]  Yiming Yang,et al.  A study of thresholding strategies for text categorization , 2001, SIGIR '01.

[2]  Andrew McCallum,et al.  A comparison of event models for naive bayes text classification , 1998, AAAI 1998.

[3]  Jon M. Kleinberg Authoritative sources in a hyperlinked environment , 1999 .

[4]  Virgílio A. F. Almeida,et al.  Improving Spam Detection Based on Structural Similarity , 2005, SRUTI.

[5]  Cheng-Hsien Tang,et al.  Enterprise Email Classification Based on Social Network Features , 2011, 2011 International Conference on Advances in Social Networks Analysis and Mining.

[6]  M. C. Jones,et al.  E. Fix and J.L. Hodges (1951): An Important Contribution to Nonparametric Discriminant Analysis and Density Estimation: Commentary on Fix and Hodges (1951) , 1989 .

[7]  Jeffrey O. Kephart,et al.  MailCat: an intelligent assistant for organizing e-mail , 1999, AGENTS '99.

[8]  Munmun De Choudhury,et al.  Inferring relevant social networks from interpersonal communication , 2010, WWW '10.

[9]  Salvatore J. Stolfo,et al.  A Behavior-Based Approach to Securing Email Systems , 2003, MMM-ACNS.

[10]  Leo Breiman,et al.  Random Forests , 2001, Machine Learning.

[11]  Inderjit S. Dhillon,et al.  Concept Decompositions for Large Sparse Text Data Using Clustering , 2004, Machine Learning.

[12]  G. Golub Matrix computations , 1983 .

[13]  Christopher Meek,et al.  Challenges of the Email Domain for Text Classification , 2000, ICML.

[14]  Adam Perer,et al.  Contrasting portraits of email practices: visual approaches to reflection and analysis , 2006, AVI '06.

[15]  Fernanda B. Viégas,et al.  Visualizing email content: portraying relationships from conversational histories , 2006, CHI.

[16]  Simon Otjes,et al.  The Netherlands: The Netherlands , 2010 .

[17]  Gerard Salton,et al.  Research and Development in Information Retrieval , 1982, Lecture Notes in Computer Science.

[18]  Harris Drucker,et al.  Support vector machines for spam categorization , 1999, IEEE Trans. Neural Networks.

[19]  M E J Newman,et al.  Community structure in social and biological networks , 2001, Proceedings of the National Academy of Sciences of the United States of America.

[20]  Judit Bar-Ilan,et al.  Web Data Management Practices: Emerging Techniques and Technologies , 2007 .

[21]  Bernardo A. Huberman,et al.  E-Mail as Spectroscopy: Automated Discovery of Community Structure within Organizations , 2005, Inf. Soc..

[22]  David B. Skillicorn,et al.  Structure in the Enron Email Dataset , 2005, Comput. Math. Organ. Theory.

[23]  Carman Neustaedter,et al.  Understanding sequence and reply relationships within email conversations: a mixed-model visualization , 2003, CHI '03.

[24]  Paul P. Maglio,et al.  Expertise identification using email communications , 2003, CIKM '03.

[25]  Tobias Scheffer Email answering assistance by semi-supervised text classification , 2004, Intell. Data Anal..

[26]  Wei-Yin Loh,et al.  Classification and regression trees , 2011, WIREs Data Mining Knowl. Discov..

[27]  Peter Bruza,et al.  Discovery of Implicit and Explicit Connections Between People Using Email Utterance , 2003, ECSCW.

[28]  Constantine D. Spyropoulos,et al.  An experimental comparison of naive Bayesian and keyword-based anti-spam filtering with personal e-mail messages , 2000, SIGIR '00.

[29]  Andrew P. Bradley,et al.  The use of the area under the ROC curve in the evaluation of machine learning algorithms , 1997, Pattern Recognit..

[30]  Michael McGill,et al.  Introduction to Modern Information Retrieval , 1983 .

[31]  Karen Spärck Jones A statistical interpretation of term specificity and its application in retrieval , 1972, J. Documentation.

[32]  P. Oscar Boykin,et al.  Personal Email Networks: An Effective Anti-Spam Tool , 2004, ArXiv.

[33]  Salvatore J. Stolfo,et al.  Behavior Profiling of Email , 2003, ISI.

[34]  Corinna Cortes,et al.  Support-Vector Networks , 1995, Machine Learning.

[35]  Dit-Yan Yeung,et al.  A learning approach to spam detection based on social networks , 2007 .

[36]  Gene H. Golub,et al.  Matrix computations (3rd ed.) , 1996 .

[37]  Hongyuan Zha,et al.  Exploring Support Vector Machines and Random Forests for Spam Detection , 2004, CEAS.

[38]  Yiming Yang,et al.  The Enron Corpus: A New Dataset for Email Classification Research , 2004 .

[39]  William W. Cohen Fast Effective Rule Induction , 1995, ICML.

[40]  Andrew McCallum,et al.  The Author-Recipient-Topic Model for Topic and Role Discovery in Social Networks: Experiments with Enron and Academic Email , 2005 .

[41]  Michael F. Schwartz,et al.  Discovering shared interests using graph analysis , 1993, CACM.

[42]  Michael Gertz,et al.  Mining email social networks in Postgres , 2006, MSR '06.

[43]  Bradley Taylor,et al.  Sender Reputation in a Large Webmail Service , 2006, CEAS.

[44]  Marco Stuit,et al.  Discovery and analysis of e-mail-driven business processes , 2012, Inf. Syst..

[45]  Stephen E. Robertson,et al.  Probabilistic models of indexing and searching , 1980, SIGIR '80.

[46]  Marino Segnan Web Data Management Practices - Emerging Techniques and Technologies , 2007, Comput. J..

[47]  Yiming Yang,et al.  Mining social networks for personalized email prioritization , 2009, KDD.

[48]  Yossi Matias,et al.  Suggesting friends using the implicit social graph , 2010, KDD.

[49]  Gordon V. Cormack,et al.  Spam Corpus Creation for TREC , 2005, CEAS.

[50]  Pat Langley,et al.  Estimating Continuous Distributions in Bayesian Classifiers , 1995, UAI.

[51]  Jian Pei,et al.  Finding email correspondents in online social networks , 2013, World Wide Web.

[52]  William W. Cohen,et al.  Ranking Users for Intelligent Message Addressing , 2008, ECIR.

[53]  Peter Willett,et al.  Readings in information retrieval , 1997 .

[54]  Shlomo Hershkop,et al.  Automated social hierarchy detection through email network analysis , 2007, WebKDD/SNA-KDD '07.

[55]  Minoru Sasaki,et al.  Spam detection using text clustering , 2005, 2005 International Conference on Cyberworlds (CW'05).

[56]  Susan T. Dumais,et al.  A Bayesian Approach to Filtering Junk E-Mail , 1998, AAAI 1998.

[57]  Gordon V. Cormack,et al.  A Study of Supervised Spam Detection applied to Eight Months of Personal E-Mail , 2004 .

[58]  I. Cloete,et al.  Learning to classify email: a survey , 2005, 2005 International Conference on Machine Learning and Cybernetics.

[59]  Chris Watkins,et al.  Proceedings of the European Conference on Machine Learning (ECML) , 2006 .

[60]  Al Bredenberg E-Mail: Spam , 2011, Encyclopedia of Information Assurance.

[61]  Enrico Blanzieri,et al.  A survey of learning-based techniques of email spam filtering , 2008, Artificial Intelligence Review.

[62]  Patrick D. McDaniel,et al.  Email Communities of Interest , 2007, CEAS.

[63]  Scott P. Robertson,et al.  Proceedings of the SIGCHI Conference on Human Factors in Computing Systems , 1991 .

[64]  Jason D. M. Rennie ifile: An Application of Machine Learning to E-Mail Filtering , 2000 .

[65]  J. MacQueen Some methods for classification and analysis of multivariate observations , 1967 .

[66]  George M. Mohay,et al.  E-Mail Authorship Attribution for Computer Forensics , 2002, Applications of Data Mining in Computer Security.

[67]  John C. Tang,et al.  Am I wasting my time organizing email?: a study of email refinding , 2011, CHI.

[68]  George M. Mohay,et al.  Multi-Topic E-mail Authorship Attribution Forensics , 2001 .

[69]  Irena Koprinska,et al.  Learning to classify e-mail , 2007, Inf. Sci..

[70]  Carman Neustaedter,et al.  Beyond "from" and "received": exploring the dynamics of email triage , 2005, CHI Extended Abstracts.

[71]  Jerome L. Myers,et al.  Research Design and Statistical Analysis , 1991 .

[72]  Robert E. Kraut,et al.  Email overload at work: an analysis of factors associated with email strain , 2006, IEEE Engineering Management Review.

[73]  Georgios Paliouras,et al.  Learning to Filter Spam E-Mail: A Comparison of a Naive Bayesian and a Memory-Based Approach , 2000, ArXiv.

[74]  Nicolas Ducheneaut,et al.  In Search of Coherence: A Review of E-Mail Research , 2005, Hum. Comput. Interact..

[75]  Gerard Salton,et al.  A vector space model for automatic indexing , 1975, CACM.

[76]  Candace L. Sidner,et al.  Email overload: exploring personal information management of email , 1996, CHI.

[77]  George M. Mohay,et al.  Identifying the authors of suspect email , 2001 .

[78]  Andrea Lockerd Thomaz,et al.  DriftCatcher: The Implicit Social Context of Email , 2003, INTERACT.

[79]  Athena Vakali,et al.  Web Data Management Practices: Emerging Techniques and Technologies , 2007 .

[80]  Henrik Ernstson,et al.  Social Network Analysis (SNA) , 2012 .

[81]  James A. Hendler,et al.  Reputation Network Analysis for Email Filtering , 2004, CEAS.

[82]  Thomas Karagiannis,et al.  Behavioral Profiles for Advanced Email Features , 2009, WWW '09.

[83]  Steffen Bickel,et al.  Learning from Message Pairs for Automatic Email Answering , 2004, ECML.

[84]  Tobias Schrödel E-Mail & Spam , 2014 .

[85]  Ian Smith,et al.  Quality Versus Quantity: E-Mail-Centric Task Management and Its Relation With Overload , 2005, Hum. Comput. Interact..

[86]  Bradley Malin,et al.  Email alias detection using social network analysis , 2005, LinkKDD '05.

[87]  Michael Gertz,et al.  Mining email social networks , 2006, MSR '06.

[88]  Linton C. Freeman A set of measures of centrality based upon betweenness , 1977 .

[89]  William W. Cohen Learning Rules that Classify E-Mail , 1996 .

[90]  Michael I. Jordan,et al.  Latent Dirichlet Allocation , 2003, J. Mach. Learn. Res..

[91]  Marie-Francine Moens,et al.  Highly discriminative statistical features for email classification , 2012, Knowledge and Information Systems.

[92]  Olle Bälter,et al.  Keystroke level analysis of email message organization , 2000, CHI.