Empirical Analysis of Password Reuse and Modification across Online Service

Passwords leaked in data breaches pose a serious threat to users who reuse them elsewhere. Although more online services are being breached, a large-scale quantitative understanding of the risks of password reuse across services is still lacking. In this paper, we analyze a large collection of 28.8 million users and their 61.5 million passwords across 107 services. We find that 38% of the users have reused exactly the same password across different sites, while 20% have modified an existing password to create a new one. In addition, we find that password modification patterns are highly consistent across user demographics, which makes them highly predictable. To quantify the risk, we build a new training-based guessing algorithm and show that more than 16 million password pairs can be cracked within just 10 attempts (30% of the modified passwords and all of the reused passwords).
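
To make the distinction between reused and modified passwords concrete, the following is a minimal sketch of how a pair of passwords belonging to the same user might be labeled. It is an illustration under stated assumptions, not the paper's method: the abstract does not specify how modifications are detected, so the use of Levenshtein edit distance, the function names `edit_distance` and `classify_pair`, and the threshold `max_edits = 3` are all hypothetical choices.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a rolling one-row dynamic program."""
    prev = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca -> cb
        prev = curr
    return prev[-1]


def classify_pair(p1: str, p2: str, max_edits: int = 3) -> str:
    """Label a password pair as 'reused', 'modified', or 'distinct'.

    ASSUMPTION: treating any pair within `max_edits` edits as a modification
    is a hypothetical rule chosen for illustration only.
    """
    if p1 == p2:
        return "reused"
    if edit_distance(p1, p2) <= max_edits:
        return "modified"
    return "distinct"


if __name__ == "__main__":
    for pair in [("hunter2", "hunter2"),
                 ("hunter2", "hunter2!"),
                 ("hunter2", "correcthorse")]:
        print(pair, classify_pair(*pair))
```

An exact match is trivially guessable once one site leaks, which is consistent with the abstract's statement that all reused passwords fall within the first 10 attempts. A small edit distance only suggests that a modified password may be reachable by a few transformations of the leaked one.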
