Measuring Metrics

You get what you measure, and you can't manage what you don't measure. Organizations use metrics to set goals and to decide which new products and features should be released to customers, which new tests and experiments should be conducted, and how resources should be allocated. To a large extent, metrics drive the direction of an organization, and getting metrics 'right' is one of the most important problems an organization needs to solve. It is also one of the hardest: good metrics must capture long-term company goals, which rest on abstract concepts such as success, delight, loyalty, engagement, and lifetime value. How can one determine that a metric is a good one, or that one metric is better than another? In other words, how do we measure the quality of metrics? Can the evaluation process be automated so that anyone with an idea for a new metric can quickly evaluate it? In this paper we describe the metric evaluation system deployed at Bing, where we have been working on designing and improving metrics for over five years. We believe that by applying a data-driven approach to metric evaluation we have been able to substantially improve our metrics and, as a result, ship better features and improve the search experience for Bing's users.
