Security and Communication Networks

Volume 2018, Article ID 6160125, 12 pages

https://doi.org/10.1155/2018/6160125

## An Alternative Method for Understanding User-Chosen Passwords

Correspondence should be addressed to Ping Wang; nc.ude.ukp@gnawp

Received 30 August 2017; Revised 2 December 2017; Accepted 27 December 2017; Published 28 January 2018

Academic Editor: Qi Jiang

Copyright © 2018 Zhixiong Zheng et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

We present in this paper an alternative method for understanding user-chosen passwords. In password research, much attention has been given to increasing the security and usability of individual passwords for common users. Few of them focus on the relationships between passwords; therefore we explore the relationships between passwords: modification-based, similarity-based, and probability-based. By regarding passwords as vertices, we shed light on how to transform a dataset of passwords into a password graph. Subsequently, we introduce some novel notions from graph theory and report on a number of inner properties of passwords from the perspective of graph. With the assistance of Python Graph-tool, we are able to visualize our password graph to deliver an intuitive grasp of user-chosen passwords. Five real-world password datasets are used in our experiments to fulfill our thorough experiments. We discover that some passwords in a dataset are tightly connected with each other; they have the tendency to gather together as a cluster like they are in a social network; password graph has logarithmic distribution for its degrees. Top clusters in password graph could be exploited to obtain the effective mangling rules for cracking passwords. Also, password graph can be utilized for a new kind of password strength meter.

#### 1. Introduction

The invention of computers ushered in a new era of digital lives, and text-based passwords have almost dominated human-computer authentication since then. Passwords continue to prevail on the web as the primary method for user authentication despite consensus among researchers that people deserve something more secure and user-friendly. Many researches have focused more on the specific point of the problem of user authentication involving no user interactions, which is literally the weakest point of the password authentication. There is all the time alternative authentication mechanisms aiming to outright replace password-based authentication method proposed by a bunch of researchers: graphical passwords [1], authentication based on biometric [2], authentication based on user behaviors [3], single sign-on system, and so forth. However, the growing password-based authentication adoptions [4] irrespective of their well-known security and usability drawbacks [5–7] reveal the fact that each of the alternatives to password-based authentication has its own shortcomings as compared to passwords [8]. The inertia of user habits, uncertain transition costs, and constant improvements in passwords lead to the result that incumbent passwords will continue as a stubborn signal for identity authentication in the foreseeable future, where the goal is not impregnable defense but the balance between usability and security [8].

Surveys are conducted by many researches [9–11] to reveal user real-world behaviors in managing passwords. The conflict between users’ limited memory and growing number of passwords they need to organize is the main challenge users are confronted with presently. Users reuse passwords or embed mnemonic information into password [12] to alleviate their burden of memorizing passwords. This implies the unpleasant fact that users select usability over security unconsciously. Security experts have suggested the use of password vaults (managers/wallets) to assist password-based authentication with which users need merely to remember one master password to encrypt all their online account passwords in the vault [13, 14].

Before 2009, password researchers proposed some heuristic methods to study the password-based authentication whose main purpose is to reveal the weaknesses of passwords and outright replace passwords in the end [5, 15]. After 2009, large-scale password datasets have been breached widely from hackers’ attacking or intruders’ intrusion, which are afterwards publicly available, for example, the leak of 32 M passwords from the gaming website RockYou in 2009 is currently the biggest corpus available to the public. In the literature of password, breaching of large-scale datasets of passwords have guided the studies of passwords to a more scientific and rigorous method. Password corpora have typically been used to analyze the distribution of passwords [16], names used in passwords and character distribution [9], or the priority among priorities: used as learning set to train probabilistic models such as PCFG-based [17], Markov-based [18], and NLP-based (Natural Language Processing) [19] password cracking algorithms in order to simulate adversarial password cracking process, leading to sophisticated password dataset strength evaluation methods [20]. The probabilistic password cracking models are afterwards modified elaborately by researchers to evaluate single password strength.

##### 1.1. Motivations

The analysis of the password security can date back at least to Morris and Thompsons 1979 seminal analysis of 3,000 passwords [21]. The analyzing methods they employed can be classified into two categories: password cracking and semantic evaluation. However, these two methodologies focus solely on the individual passwords and neglect implicitly the relationships between passwords. The neglected relationships on the contrary is remarkably important properties of password dataset that could facilitate the password cracking and semantic evaluation. Due to the intrinsically incomplete evaluation of traditional semantic and cracking methodologies, we advocate a new alternative method understanding user-chosen passwords.

In mathematics, graph is mathematical structures used to model pairwise relations between objects. A graph in this context is made up of vertices, nodes, or points which are connected by edges that can either be directed or undirected. Graph-theoretic methods, in various forms, have been proven particularly useful in many fields such as linguistics, chemistry, physics, and sociology. While there is still no application of graph theory in the literature of passwords research, therefore, we are going to apply graph theory to password dataset to dig deeper into the inner properties of passwords to assess and compare the inherent security behaviors of users. We call this analysis method* relationship evaluation* as an extension to semantic evaluation. Our paper provides an alternative view of password relationships through password graph, we also provide a visualizing method for the generated password to observe and study it intuitively.

##### 1.2. Contributions

In this work, we make the following key contributions:(i)*An alternative view of password relationships*: we explore the relationships between passwords: modification-based, similarity-based, and probability-based. The modification-based relationship builds on the observation that a user usually modifies an existing password to retrieve a new one. Passwords are basically strings; thus we borrow the idea from string similarity to develop the password similarity-based relationship. The probability-based relationship is the idea derived from password distribution where each password has probability associated with it.(ii)*Visualizing of password graph*: by regarding passwords as vertices and leveraging one of the relationships we explored, we are able to transform a password dataset to a graph and we call it* password graph*. With the assistance of Python Graph-tool, we visualize our generated password graph; our visualization method can intuitively convey deeper characteristics of password dataset which lay otherwise under the hood and remain undiscovered.(iii)*Some insights from graph theory*: the resulting password graph provides us a fresh new perspective of password dataset. We will revisit some key terminology in graph theory to find out what our new password graph bring about.

##### 1.3. Organizations

In Section 2, we review prior research works on password cracking and semantic evaluations. Section 3 provides some preliminaries. Section 4 details our exploration of password relationships in dataset of password. Section 5 elaborates on our construction of password graph. Section 6 provides some key insights from graph theory to our password graph. Finally, Section 7 concludes our paper.

#### 2. Related Work

In this section, we briefly review prior pivotal research on password cracking and semantic evaluations to assist our follow-up discussions and explorations.

In 1979, Morris and Thompson analyzed a database of 3,000 passwords and reported some basic statistics: 71.12% passwords of their sample of passwords were 6 characters or fewer and 86% fell into one of the dictionaries, name lists, and the like. In 1990, Klein [22] collected/etc/passwd files in which passwords were in hash format from his friends and acquaintances in United States and Great Britain. 21% passwords were cracked in the first week, total approximately 25% of the passwords had been guessed, and 51.70% of the cracked passwords are not longer than 6 characters. Dedicated cracking software tools like John the Ripper [23] and hashcat [24] have appeared since and are armed with numerous cracking modes (e.g., brute-force attack, and dictionary attack)

Mangling rules in dictionary attack mode continue to evolve beyond heuristic rules: Weir et al. [17] built a machine learning technique based on context-free grammar to automatically derive mangling rules from a large training set of cleartext passwords. Houshmand and Aggrawal [18] derived Markov-based password cracking algorithm from Markov-Chain that representatively originated PageRank algorithm. Originally, Markov-based algorithm is not a probabilistic model; Ma et al. [25] investigated password characteristics about length and the structure of 6 datasets, 3 of which were from Chinese websites and improved it by using different normalization and smoothing methods. They found that when done correctly, Markov-based cracking model performed better than PCFG-based password cracking model. In 2012, Veras et al. [26] had done the work quite similar to us; they examined 32 M RockYou dataset by employing visualization techniques. They observed that 15.26% of passwords contained sequences of 5–8 consecutive digits, 38% of which could be further classified as dates, but their research mainly focused on password patterns which are different from ours.

The guessing resistance of a single user-chosen password was previously estimated by entropy, with reference to Claude Shannon’s famous measure . Borrowing the idea of Shannon entropy, a variation of Shannon entropy was proposed in NIST Electronic Authentication Guideline. It calculated the password entropy mainly based on the length of passwords and added partial points if some special heuristic checks were passed to make it more secure. Unfortunately, Shannon entropy and its variations characterize the strength of a distribution; for an attacker who wants just to crack a certain proportion of all passwords, Shannon entropy has no direct correlation to the guessing difficulty. Ad hoc metrics (password strength meter) had already been demonstrated far from accurate by Weir et al. [27]; they advocated that the cracking-based password strength meter was more compelling. Markov-based PSM [28] and PCFG-based PSM [18] were proposed subsequently based on Markov model and PCFG model. Wang et al. [12] created a novel PSM on a solid foundation where user usually reuses one of his/her passwords rather than creating a new one. By using two training sets (one as base dictionary and the other as rule learning dictionary), their PSM was able to derive empirically users’ mangling rules on passwords and thus more accurate.

In 2012, Bonneau conducted a large-scale analysis of 70 million Yahoo private passwords and proposed that a more direct password strength metric was guesswork (G) [29]. Yet in 2014, Li et al. [30] came to a conclusion that Bonneau’s 70 M passwords were not representative enough for all users, especially Chinese users who are not familiar with English. Chinese users prefer digits than letters. They also showed that Chinese users inclined to insert Pinyins and dates into their passwords.

Semantic patterns including personal information (e.g., birth dates, personal names, and nicknames) are prevalently embedded in user-chosen passwords [31]. Additionally, a few basic data characteristics like average length, length distribution, and types of characters used were typically reported. Further, the structure patterns were also studied by some researchers [9], passwords containing digits constituted more than 50% Chinese web passwords while this value of English counterpart was only 11.30%, reinforcing the hypothesis that user-generated passwords were greatly influenced by their native languages. More systematic methodologies had been proposed by Shay et al. [32] for creating a new password policy. One inconspicuous thing that semantic evaluations fail to attain is the relationships between passwords in the same dataset. Therefore, we for the first time augment this kind of evaluation method by introducing password relationship and graph theory into password evaluations. We hope this evaluation method helps to study the password evaluations thoroughly.

In 2010, Zhang et al. [33] found that modifications to one’s old passwords tend to be predictable, and they utilized this observation to facilitate password cracking. Their work is actually password reuse by a certain user; it is independent from other users. Our work focuses on password relationships between different users and classifies them into three classifications.

As far as we know, the work by Guo et al. [34] may be the closest one to what we have done in this work. They visualized several password datasets including Yahoo!, PhpBB, MySpace, Honeynet, hotmail, and 12306. Their discovery provides an explanation of the attacking curve that has long been observed in decades. We find one of their conclusions is wrong: degrees of passwords follow logarithmic law not power law as they claimed; we will detail this in Section 6. Overall, though our works are a bit similar, there are still very critical differences between our work and their work: (a) we explore the relationships between passwords while they do not the relationships between passwords; (b) we explore the effects of different thresholds while they only experimented with threshold 3; (c) our insights obtained from visualized graph is essentially distinct from their work.

#### 3. Preliminaries

In this section, we explicate formal definitions of graph and linear regression and then introduce the metric for evaluating how well the regression line approximates the real data points. Finally, we provide some basic information on our experimental datasets.

*Mathematical Notation*. We denote a password dataset with a calligraphic letter , a password after duplicates removing with another calligraphic letter . Let denote the total number of passwords in .

##### 3.1. Graph

*Formal Definition of Graph*. A graph can be defined as a pair , where is a set of vertices and is a set of edges between the vertices, :

Generally, graphs can be classified into two types: (a) undirected graph, the adjacency relation defined by the edges is symmetric. (b) Directed graph is a graph in which edges have orientations. A simple graph is an undirected graph in which both parallel edges and loops are disallowed while multigraph otherwise allows them.

##### 3.2. Linear Regression

In statistics, linear regression is an approach for modeling the correlation between two variables (one as a scalar dependent variable denoted by and the other as explanatory variable denoted by ) by fitting a linear equation to the experimental data. The most common method for linear regression is least-squares. Usually, in linear regression, given the value of explanatory variable , the value of dependent variable is an affine function of : , the slope of the line is , and is the intercept.

In linear regression, the statistical measure of how well the regression line approximates the real experimental data points is often calculated and compared through the coefficient of determination. People usually denote the coefficient of determination by , which has the range from 0 to 1, the closer to 1 the better. Therefore, a value of 1 indicates that all experimental data points are perfectly positioned on the regression line; of indicates the contrary result.

##### 3.3. Spearman’s Coefficient

In statistic, Spearman’s coefficient is a nonparametric measure of rank correlation. It assesses how well the relationship between two variables can be described using a monotonic function. The Spearman coefficient is defined as the Pearson correlation coefficient between the two ranked vectors , : where is the covariance of the rank variables; and are the standard deviations of the rank variables. By definition, , where 0 indicates independence between the vectors. A perfect Spearman correlation of 1 or −1 happens when the agreement between the vectors is a monotone function.

##### 3.4. Password Guess Number

The password guess number characterizes the time complexity required for a password cracking algorithm (PCFG-based or Markov-based) to recover a password. This is generally achieved by measuring the guess number required to crack the password. Dell’Amico and Filippone [20] detail a Monte Carlo sampling method that converts a password probability as computed by PCFG or Markov model into an estimate of cracker’s guess number: where is the sample size and is the password draw randomly from the corresponding distribution.

##### 3.5. Datasets

For completeness, we give a brief description of the five datasets used in our experiments (see Table 1). They were breached either by hackers or by intruders and later disclosed publicly on the Internet; some of them have already been used in password cracking models [17, 18].