Evaluating User Actions as a Proxy for Email Significance

Email remains a critical channel for communicating information in both personal and work accounts. The number of emails people receive every day can be overwhelming, which makes it hard to manage and consume information efficiently. A good estimate of the significance of emails is the foundation for many downstream tasks (e.g., email prioritization), but determining significance at scale is expensive and challenging. In this work, we hypothesize that the cumulative set of actions on any individual email can be considered a proxy for the perceived significance of that email. We propose two approaches to summarize observed actions on emails, which we then evaluate against perceived significance. The first approach is a fixed-form utility function parameterized by a set of weights, and we study the impact of different weight assignment strategies. In the second approach, we build machine learning models that predict users' perceived significance directly from the observed actions. For evaluation, we collect human judgments on email significance for both personal and work emails. Our analysis suggests that actions and significance of emails are positively correlated and that the actions performed on personal and work emails differ. We also find that the degree of correlation varies across people, which may reflect the individualized nature of email activity patterns or of significance itself. Subsequently, we develop an example of real-time email significance prediction that uses action summaries as implicit feedback at scale. Evaluation results suggest that the resulting significance predictions agree positively with human assessments, albeit not at statistically strong levels. We speculate that personalized significance prediction may be required to improve agreement levels.
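The first approach, a fixed-form utility function over observed actions, can be illustrated with a minimal sketch. The abstract does not give the functional form, the action vocabulary, or the weights, so everything below is an assumption made for illustration: a weighted sum over the distinct actions seen on one email, with hypothetical action names and hand-picked weights.

```python
from typing import Dict, Iterable

# Hypothetical action names and weights, chosen only for illustration.
# The paper studies several weight assignment strategies; none is reproduced here.
ACTION_WEIGHTS: Dict[str, float] = {
    "open": 1.0,
    "reply": 3.0,
    "forward": 2.0,
    "flag": 2.0,
    "delete": -1.0,
}


def action_utility(
    actions: Iterable[str],
    weights: Dict[str, float] = ACTION_WEIGHTS,
) -> float:
    """Summarize the actions on one email as a weighted sum.

    Assumed form: each distinct action contributes its weight once,
    and actions without a configured weight contribute nothing.
    """
    return sum(weights.get(action, 0.0) for action in set(actions))


if __name__ == "__main__":
    # An email that was opened twice and replied to once.
    print(action_utility(["open", "open", "reply"]))
```

Under these assumptions, a score like this could be compared against human significance judgments, for instance with a rank correlation; counting repeated actions or learning the weights from data would be natural variations.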
