Data-Driven Design: Beyond A/B Testing

A/B testing has become the de facto standard for optimizing design, helping designers craft more effective user experiences by leveraging data. A typical A/B test divides user traffic between two experimental conditions (A and B) and looks for statistically significant differences in performance indicators (e.g., conversion rates) between them. While this technique is popular, there are other powerful data-driven methods, complementary to A/B testing, that can tie design choices to desired outcomes.

Mining data from existing designs can expose designers to a greater space of divergent solutions than A/B testing [1,4, RICO:2017]. Since companies cannot predict a priori whether the engineering effort for creating alternatives will be commensurate with a performance increase, they often test small changes, moving along gradients to local optima. With the millions of websites and mobile apps available today, it is likely that almost any UX problem a designer encounters has already been considered and solved by someone. The challenges are finding relevant existing solutions, measuring their performance, and correlating these metrics with design features. Recent systems that capture and aggregate interaction data from third-party Android apps, with zero code integration, open up analytics that were previously locked away in each app, allowing designers to test and compare UI/UX patterns found in the wild [2,3].

Lightweight prototypes with tight user feedback loops, or experimentation engines, can bootstrap product design involving technologies that are actively being developed (e.g., artificial intelligence, virtual/augmented reality), where both use cases and capabilities are not well understood [5]. These systems afford staged automation: initially, "Wizard of Oz" techniques can scaffold needfinding, and eventually they can be replaced with automated solutions informed by the collected data.
For example, a chatbot deployed on social media can serve as an experimentation engine for automating fashion advice [7]. At first, a pool of personal stylists can power the chatbot to collect organic conversations revealing common fashion problems, effective interaction patterns for addressing them, and design considerations for automation. Once technologies are developed to scale useful interventions [8,9], the chatbot platform provides a testbed for iteratively refining them.

Generative models trained on a set of effective design examples can support predictive workflows that allow designers to rapidly prototype new, performant solutions [6]. Models such as generative adversarial networks and variational autoencoders can produce designs based on high-level constraints, or complete them given partial specifications. For example, a mobile wireframing tool backed by such a model could suggest adding "username" and "password" input fields to a screen with a centrally placed "login" button.
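
The completion behavior in the login example can be illustrated with a much simpler stand-in than the models named above. The following is a minimal sketch that assumes screens are represented as sets of element labels and that ranks missing elements by co-occurrence counts; it is not a generative adversarial network or a variational autoencoder, and the toy corpus, the element names, and the functions `fit_cooccurrence` and `suggest_elements` are all hypothetical, introduced only for illustration.

```python
from collections import Counter
from itertools import permutations

# Hypothetical toy corpus (not real data): each screen is the set of
# UI element labels it contains.
EXAMPLE_SCREENS = [
    {"login_button", "username_field", "password_field"},
    {"login_button", "username_field", "password_field", "forgot_password_link"},
    {"search_bar", "result_list"},
    {"login_button", "username_field", "password_field", "logo"},
    {"search_bar", "result_list", "filter_button"},
]


def fit_cooccurrence(screens):
    """Count element occurrences and ordered pairwise co-occurrences."""
    element_counts = Counter()
    pair_counts = Counter()
    for screen in screens:
        element_counts.update(screen)
        # Every ordered pair (a, b) with a != b that shares this screen.
        pair_counts.update(permutations(sorted(screen), 2))
    return element_counts, pair_counts


def suggest_elements(partial_screen, element_counts, pair_counts, top_k=2):
    """Rank absent elements by the mean estimate of P(candidate | present)."""
    scores = {}
    for candidate in element_counts:
        if candidate in partial_screen:
            continue
        conditionals = [
            pair_counts[(present, candidate)] / element_counts[present]
            for present in partial_screen
            if element_counts[present] > 0  # skip elements never seen in the corpus
        ]
        if conditionals:
            scores[candidate] = sum(conditionals) / len(conditionals)
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, score in ranked[:top_k] if score > 0]


if __name__ == "__main__":
    counts, pairs = fit_cooccurrence(EXAMPLE_SCREENS)
    print(suggest_elements({"login_button"}, counts, pairs))
    # prints ['password_field', 'username_field']
```

A trained generative model would replace the counting step with learned structure over layout and content, but the interface is the same in spirit: a partial specification goes in, and ranked completions come out.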