Creating User Interface Mock-ups from High-Level Text Descriptions with Deep-Learning Models

The design process of user interfaces (UIs) often begins with articulating high-level design goals. Translating these high-level design goals into concrete design mock-ups, however, requires extensive effort and UI design expertise. To facilitate this process for app designers and developers, we introduce three deep-learning techniques that create low-fidelity UI mock-ups from a natural-language phrase describing the high-level design goal (e.g., “pop up displaying an image and other options”). In particular, we contribute two retrieval-based methods and one generative method, along with pre-processing and post-processing techniques that ensure the quality of the created UI mock-ups. We quantitatively and qualitatively compare and contrast each method’s ability to suggest coherent, diverse, and relevant UI design mock-ups. We further evaluate these methods with 15 professional UI designers and practitioners to understand each method’s advantages and disadvantages. The designers responded positively to the potential of these methods for assisting the design process.
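The abstract does not spell out how the retrieval-based methods work, so the following is only a minimal illustrative sketch of what one retrieval step could look like: embed the design-goal phrase and short text summaries of existing UI screens into a shared vector space, then return the nearest screens as candidate mock-ups. The sentence-transformers encoder, the screen_summaries corpus, and the cosine-similarity ranking below are all assumptions made for illustration, not the authors’ actual pipeline.

```python
# Illustrative sketch (assumed, not the paper's method): retrieve candidate
# UI screens for a natural-language design goal via embedding nearest neighbors.
import numpy as np
from sentence_transformers import SentenceTransformer

# Hypothetical corpus: one short text summary per existing UI screen,
# e.g. produced by a screen-summarization model or manual annotation.
screen_summaries = [
    "login screen with email and password fields",
    "pop up dialog showing a photo with share and cancel buttons",
    "settings list with toggles for notifications",
]

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed off-the-shelf encoder
screen_embs = model.encode(screen_summaries, normalize_embeddings=True)

def retrieve(query: str, k: int = 2):
    """Return the k screens whose summaries are closest to the query text."""
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = screen_embs @ q          # cosine similarity (vectors are unit-norm)
    top = np.argsort(-scores)[:k]
    return [(screen_summaries[i], float(scores[i])) for i in top]

print(retrieve("pop up displaying an image and other options"))
```

In a pipeline like the one described, the retrieved screens would then be post-processed, for example simplified into low-fidelity wireframes, before being presented as mock-up suggestions.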
