In an effort to assess the strength of passwords, password strength checkers count lower-case and upper-case letters, digits, and other characters. However, this does not truly measure how likely a given password is. To determine the likelihood of a password, one must first understand how passwords are generated; this chapter takes a first step in that direction. This is particularly important in a mobile context, where users are already tempted to use short and simple passwords, given how arduous password entry is.

2.1 Why We Need to Understand Passwords

While we do not think that passwords are the best way for people to authenticate to their devices and service providers, it is important to recognize the degree to which passwords are part of the infrastructure, which makes them difficult to replace, even if we agree on what to replace them with. This chapter describes a new method by which we can address two common problems relating to traditional passwords.

The first problem is that of approximating the security of a given credential. Traditional password strength checkers simply demand that certain predicates hold: that the password combines uppercase and lowercase letters, that it includes numerals, and that it does not match any of the most common passwords (such as “abc123”). This is not necessarily the optimal strategy, as it does not capture common transformations (such as from the common password “password” to the very similar “passw0rd”). Rather than extending the blacklist to all common variants of all common passwords, it is better to understand the underlying structure of passwords and how people generate them. This allows us to score passwords based on how they were generated, and thereby to determine that “p1a2s3s4w5o6r7d” (which is simply “password” interleaved with the digits 1 through 7) is less secure than “a1d9o8g4.” Without an understanding of how passwords are generated, the former password is likely to be believed to be the stronger of the two.
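The limitation of predicate-based checking can be illustrated with a minimal sketch. The blacklist, the scoring weights, and the function names `naive_score` and `deinterleave` are assumptions made for this illustration; they are not taken from any particular strength checker or from the method developed in this chapter.

```python
from typing import Tuple

# Hypothetical blacklist for illustration; real checkers use far longer lists.
COMMON_PASSWORDS = {"password", "abc123", "qwertyuiop"}


def naive_score(password: str) -> int:
    """Score a password by length and character classes (illustrative weights)."""
    if password.lower() in COMMON_PASSWORDS:
        return 0  # exact blacklist hit
    score = len(password)
    if any(c.islower() for c in password):
        score += 2
    if any(c.isupper() for c in password):
        score += 2
    if any(c.isdigit() for c in password):
        score += 2
    if any(not c.isalnum() for c in password):
        score += 2
    return score


def deinterleave(password: str) -> Tuple[str, str]:
    """Split a string into its even-indexed and odd-indexed characters."""
    return password[0::2], password[1::2]


for candidate in ("passw0rd", "p1a2s3s4w5o6r7d", "a1d9o8g4"):
    letters, _ = deinterleave(candidate)
    # Print the naive score and whether the even-indexed half is blacklisted.
    print(candidate, naive_score(candidate), letters in COMMON_PASSWORDS)
```

Under these assumed weights, “p1a2s3s4w5o6r7d” scores 19 against 12 for “a1d9o8g4,” and “passw0rd” slips past the blacklist. De-interleaving, by contrast, shows that the higher-scoring password is just a blacklisted word threaded with digits, which is the kind of structure a generation-aware scorer would need to recognize.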
(Of course, if mindless exhaustive search is the only path of compromise, the former password, “p1a2s3s4w5o6r7d,” is the stronger of the two, so security must be seen in the context of the most prevalent threat.)

M. Jakobsson, Mobile Authentication, SpringerBriefs in Computer Science, DOI: 10.1007/978-1-4614-4878-5_2, The Author(s) 2013 (Chapter 2: The Benefits of Understanding Passwords)

The second problem this chapter addresses is how to identify credential reuse, whether sequentially for one account or concurrently across two or more accounts. Here, we consider not only verbatim reuse but also approximate reuse, such as “BigTomato” and “bigTOMATO1.” If a person has two accounts with different passwords and then loses one of the passwords to a fraudster, the fraudster has a reasonable chance of gaining access to the other account as well. This is because the attacker may try all common transformations of the stolen credential, hoping that one of them will work for the second account. As a result, we need to identify and discourage both verbatim and approximate password reuse. While verbatim reuse can be detected without any understanding of the underlying credentials, detection of approximate reuse requires a structural understanding of passwords.

The two techniques described herein are closely related, and both are based on the parsing and decomposition of passwords, using rules that match those people rely on when they generate passwords. Examples of such rules are insertion of one component into another component, concatenation of components, and common transformations of elements of a component.

2.2 People Make Passwords

A good password is hard to guess. Conversely, of course, a bad password is easy to guess. But what is it that makes something hard to guess, and how can we tell? It is easier to tell that something is easy to guess than that it is hard to guess.
For example, the following potential passwords are easy to guess: fraternity (a dictionary word); $L (a very short string); qwertyuiop (a string with a very predictable pattern); and LoveLoveMeDo (famous lyrics). Similarly, one can look at the commonality of passwords and cap how many users may share any one password: any user who wants to use a password that has already reached the limit has to think of another password. This approach is taken by Schechter, Herley, and Mitzenmacher [81]. We can make a long list of reasons to consider a password weak, and this is what typical password strength checkers do, but how can we tell that we have not missed some?

To answer this question, we need to understand how passwords are constructed. Passwords are constructed by people, and people follow guidelines and mental protocols when performing tasks. Therefore, a better understanding of passwords requires a better understanding of people, or at least of how people construct passwords. To gain this understanding, we collected a very large number of actual passwords. We sampled and reviewed these, thinking carefully about how each password was constructed.

It is meaningful to think of passwords as strings that are composed of components, where components are dictionary words, numbers, and other characters. When producing a password, a typical user composes it from a small number of such components using one or more rules. The most common rule is concatenation, followed by replacement, misspelling, and insertion.
Here, an example of a concatenation is producing “passbay1” from the three components “pass,” “bay,” and “1.” Use of L33T is a common replacement strategy, creating “s3v3nty” from “seventy” by replacement of each “e” with a “3.” Misspellings may be intentional or unintentional, resulting in passwords such as “clostrofobic.” Finally, insertion produces strings such as “Christi77na,” where “77” was inserted into the name “Christina.” (This was the least common type of rule among those we surveyed and the hardest to detect automatically in practice, so this rule was not used in the experiment we describe herein.)

The simple insight that people choose passwords suggests a new approach to determining the strength of a password: one can determine the components making up the password and the commonality of each such component; one can then consider the mental generation rules used to combine the components into the password, along with how commonly these rules are used. The strength of the password, in some sense, depends directly on the commonality of the password, which in turn depends on the commonality of its components and password generation rules. Similarly, when determining the similarity of two passwords, one can compare the components that make up the two passwords along with the generation rules. It is therefore important to understand the commonality of components and rules.

2.3 Building a Parser

Components and Rules

To build a parser, we need to understand the components and the generation rules, and then “decompile” passwords into the components they were made from. We will therefore review how most passwords are formed, in order to understand how to invert this process.
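As a rough illustration of what such “decompiling” might look like, the following sketch inverts two of the rules described above, concatenation and L33T replacement, and also undoes capitalization. The toy dictionary `WORDS`, the substitution table `UNLEET`, the function names, and the component-overlap similarity measure are all assumptions made for this example; misspellings and insertions are not handled, and this is not the parser developed in this chapter.

```python
from functools import lru_cache
from typing import List, Optional, Tuple

# Hypothetical toy dictionary; a real parser would use large word and name lists.
WORDS = {"pass", "bay", "seventy", "big", "tomato"}

# Assumed L33T substitutions, mapping a replacement character back to a letter.
UNLEET = str.maketrans({"3": "e", "0": "o", "1": "i", "4": "a", "5": "s", "7": "t"})


def classify(chunk: str) -> Optional[str]:
    """Return the component a chunk stands for, or None if it is not one."""
    if chunk in WORDS:
        return chunk
    plain = chunk.translate(UNLEET)  # undo the replacement rule
    if plain in WORDS:
        return plain
    if chunk.isdigit():
        return chunk  # a number component
    if len(chunk) == 1 and not chunk.isalnum():
        return chunk  # a single "other" character
    return None


def parse(password: str) -> Optional[List[str]]:
    """Return one decomposition into components, or None if none is found."""
    text = password.lower()  # treat capitalization as a transformation

    @lru_cache(maxsize=None)
    def helper(start: int) -> Optional[Tuple[str, ...]]:
        if start == len(text):
            return ()
        # Undo concatenation: try the longest candidate component first.
        for end in range(len(text), start, -1):
            component = classify(text[start:end])
            if component is None:
                continue
            rest = helper(end)
            if rest is not None:
                return (component,) + rest
        return None

    result = helper(0)
    return None if result is None else list(result)


def similarity(first: str, second: str) -> float:
    """Fraction of distinct components shared by two passwords (Jaccard overlap)."""
    a, b = parse(first), parse(second)
    if not a or not b:
        return 0.0
    return len(set(a) & set(b)) / len(set(a) | set(b))


print(parse("passbay1"))    # ['pass', 'bay', '1']
print(parse("s3v3nty"))     # ['seventy']
print(parse("bigTOMATO1"))  # ['big', 'tomato', '1']
```

With this toy dictionary, “BigTomato” and “bigTOMATO1” share two of three distinct components, so `similarity` returns 2/3 even though the strings differ in case and length. A fuller treatment would weight each component and rule by its commonality rather than counting them equally.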