Learning Network-Based Multi-Modal Mobile User Interface Embeddings

Rich multi-modal information (text, code, images, and categorical and numerical data) co-exists in the user interface (UI) design of mobile applications. UI designs are composed of UI entities that support different functions and together enable the application. To support effective search and recommendation over mobile UIs, we need to learn UI representations that integrate latent semantics. In this paper, we propose a novel unsupervised model, the Multi-modal Attention-based Attributed Network Embedding (MAAN) model. MAAN is designed to capture both multi-modal and structural network information. Based on the encoder-decoder framework, MAAN aims to learn UI representations that allow UI design reconstruction. The generated embeddings can be applied to a variety of tasks: predicting UI elements associated with UI screens, inferring missing UI screen and element attributes, predicting UI user ratings, and retrieving UIs. Extensive experiments, including user evaluations, conducted on two datasets from RICO, a rich real-world mobile UI repository, demonstrate that MAAN outperforms other state-of-the-art models.
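The abstract does not specify MAAN's architecture, so the following is only a minimal sketch of the general idea it names: attention-weighted fusion of per-modality vectors into one UI screen embedding, followed by a decoder that scores whether a UI element belongs to that screen. Everything here is an assumption made for illustration: the function names (`attention_fuse`, `link_probability`), the fixed `query` vector standing in for learned attention parameters, the toy random vectors, and the inner-product decoder, which is borrowed from common graph auto-encoder practice rather than taken from the paper.

```python
import math
import random


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(modality_vectors, query):
    """Fuse per-modality vectors into one embedding.

    Weights come from scaled dot-product scores against `query`,
    a hypothetical stand-in for parameters a real model would learn.
    """
    dim = len(query)
    scores = [dot(v, query) / math.sqrt(dim) for v in modality_vectors]
    weights = softmax(scores)
    fused = [
        sum(w * v[i] for w, v in zip(weights, modality_vectors))
        for i in range(dim)
    ]
    return fused, weights


def link_probability(screen_emb, element_emb):
    """Assumed inner-product decoder: sigmoid of the dot product."""
    return 1.0 / (1.0 + math.exp(-dot(screen_emb, element_emb)))


# Toy data: three modality vectors for one UI screen (all values made up).
rng = random.Random(0)
dim = 4
text_vec = [rng.uniform(-1, 1) for _ in range(dim)]
image_vec = [rng.uniform(-1, 1) for _ in range(dim)]
category_vec = [rng.uniform(-1, 1) for _ in range(dim)]
query = [rng.uniform(-1, 1) for _ in range(dim)]

screen_emb, weights = attention_fuse([text_vec, image_vec, category_vec], query)

# Score a hypothetical UI element against the fused screen embedding.
element_emb = [rng.uniform(-1, 1) for _ in range(dim)]
p = link_probability(screen_emb, element_emb)

print("attention weights:", weights)
print("screen-element link probability:", p)
```

In an actual encoder-decoder model the modality vectors, the attention parameters, and the element embeddings would all be learned jointly by minimizing a reconstruction loss; the sketch only shows the forward computation on fixed inputs.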
