You Are What Apps You Use: Demographic Prediction Based on User's Apps

Understanding the demographics of app users is crucial for, among others, app developers who wish to target their advertisements more effectively. Our work addresses this need by studying how well user demographics can be predicted from the list of a user's apps, which is readily available to many app developers. We extend previous work on the problem on three fronts: (1) We predict new demographics (age, race, and income) and analyze the most informative apps for the four demographic attributes included in our analysis. The most predictable attribute is gender (82.3% accuracy), whereas the hardest to predict is income (60.3% accuracy). (2) We compare several dimensionality reduction methods for high-dimensional app data and find that an unsupervised method yields better results than aggregating the apps at the app category level, but that the best results are obtained simply by using the raw list of apps. (3) We examine the effect of the training set size and the number of apps on predictability and show that both factors have a large impact on prediction accuracy. Predictability increases (in other words, a user's privacy decreases) the more apps the user has used, but, somewhat surprisingly, prediction accuracy starts to decrease after 100 apps.
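As a rough illustration of predicting a binary attribute from the raw list of apps, the following minimal sketch encodes each user as a binary app-indicator vector and fits a logistic regression by stochastic gradient descent. The abstract does not name the classifier or the data format used in the paper, so the model choice, the function names, the app names, and the toy labels are all assumptions made for illustration only.

```python
import math

def build_vocab(users):
    """Map every app seen in the training users to a column index."""
    apps = sorted({app for user_apps in users for app in user_apps})
    return {app: i for i, app in enumerate(apps)}

def to_vector(user_apps, vocab):
    """Binary 'raw list of apps' encoding; apps unseen in training are ignored."""
    vec = [0.0] * len(vocab)
    for app in user_apps:
        if app in vocab:
            vec[vocab[app]] = 1.0
    return vec

def train_logreg(X, y, lr=0.5, epochs=200):
    """Plain SGD logistic regression (an assumed model, not the paper's)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - yi  # gradient of the log loss with respect to z
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    """Return 1 if the linear score is positive, else 0."""
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    return 1 if z > 0 else 0

# Hypothetical toy data: app names and labels are invented for illustration.
train_users = [
    {"app_a", "app_b"},
    {"app_a", "app_c"},
    {"app_d", "app_e"},
    {"app_d", "app_c"},
]
train_labels = [1, 1, 0, 0]

vocab = build_vocab(train_users)
X = [to_vector(u, vocab) for u in train_users]
w, b = train_logreg(X, train_labels)

# Apps ranked by learned weight, a crude stand-in for "most informative apps".
ranked_apps = sorted(vocab, key=lambda app: w[vocab[app]], reverse=True)
```

In this sketch the learned weights give a simple per-app ranking, which is one possible way to inspect which apps drive a prediction; the paper's own analysis of informative apps may differ.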
