Abstract

Keystroke dynamics based authentication is one of the prevention mechanisms used to protect a user’s account from illegal access by criminals. In this authentication mechanism, keystroke dynamics are used to capture patterns in a user’s typing behavior. Sequence alignment has been shown to be an effective algorithm for keystroke dynamics based authentication: it compares sequences of keystroke data to detect an imposter’s anomalous sequences. Previous research has used a static divisor for sequence generation from the keystroke data, that is, a fixed number used to divide the time range of the keystroke data into equal-length subintervals. After the division, the subintervals are mapped to alphabet letters to form sequences. One major drawback of the static divisor is that the amount of data available for subinterval generation is often insufficient, which leads to premature termination of subinterval generation and consequently causes inaccurate sequence alignment. To alleviate this problem, we introduce sequence alignment with dynamic divisor generation (SADD) in this paper. In SADD, we use a mean computed with Horner’s rule to generate dynamic divisors and apply them to produce subintervals of different lengths. Comparative experiments indicate that SADD is usually comparable to, and often outperforms, other existing algorithms.

1. Introduction

In an era full of electronic services, people want faster and more convenient ways to meet their needs. These include reading emails, searching for information online, transferring files, and paying bills online. For example, online file transfer involves storage mechanisms in a cloud system. To use these services privately, we must register a login ID and password. For online bill payment, we must log in with our ID and password before we can pay or transfer money to other users. However, criminals may discover our login ID and password and use these credentials to commit crimes such as stealing important files or money. Therefore, stronger and more secure authentication mechanisms have to be designed and implemented to prevent these problems.

A considerable number of authentication mechanisms have been introduced. One example is a biometric system [1]. Biometric systems can be divided into two categories: physiology-based approaches and behavior-based approaches. Physiology-based approaches to authentication include the use of iris, voice, and fingerprint. In contrast, behavior-based approaches include keystroke dynamics on a keyboard, mouse, or smartphone. In this paper, we focus on the behavior-based approach. It has the advantages that it is inexpensive, is easy to implement, and needs no extra hardware to operate. One possible drawback is that its performance can be lower than that of the physiology-based approach, but it can reduce the personal risk to a user. For example, in fingerprint-based authentication, an imposter can cut off the genuine user’s hand in order to access the security system. This kind of attack is not applicable to the behavior-based approach. If the imposter wants to break into a system, the options are limited to either forcing the genuine user to type the password or emulating the genuine user’s typing through practice. Therefore, a considerable number of researchers still pursue behavior-based research because they believe that keystroke dynamics can improve security and will become a common approach in the future [2–10].

Keystroke dynamics concerns the timing details of people’s typing data [11]. A timing detail is the duration between pressing a key and releasing a key, or vice versa. Keystroke dynamics can be combined with machine learning to discover knowledge about people’s typing behavior [8], emotion [12], gender [3], dominant hand [4], and so forth. In this paper, we associate typing behavior with the authentication system. It is reasonable to use keystroke dynamics for authentication because users have distinctive typing patterns. For example, some people type with only one hand whereas others use both hands [4]. Moreover, Syed et al. [8] have tested three interesting hypotheses in their research. The first finding is that users exhibit significantly dissimilar patterns when typing. The second hypothesis concerns the relationship between users’ typing ability and the event sequence, which is defined as the sequence of key-up and key-down events with the actual key values incorporated; more explanation of event sequences can be found in their paper [8]. Their results show that there is no correlation between users’ typing skill and the event sequence. The last hypothesis concerns the effect of habituation on a user’s event sequences. It reveals that keystroke dynamics data typed later is more representative than data typed earlier. This means that it is difficult for imposters to imitate legitimate users’ typing patterns. Besides typing behavior, different people have different walking styles (or gaits). That is, we can sometimes identify a person correctly just by hearing the footsteps, without seeing him or her, or by looking at the person’s appearance and walking style from behind. In summary, we can identify a person by these special characteristics, or so-called habits, and these habits include typing behavior.

The common timing details that we can obtain from keystroke dynamics are dwell time and flight time. Dwell time, also known as duration time [9], is the length of time between when a key is pressed and when it is released. Flight time, also known as interval time [9], is the length of time between when a key is released and when the next key is pressed. More details are discussed in Section 4.

A considerable number of machine learning algorithms have been introduced for keystroke dynamics, such as naïve Bayes [13], support vector machines (SVM) [10], nearest neighbor [2], and the Euclidean distance measure [14]. In this paper, we choose the sequence alignment algorithm, which is fundamentally concerned with measuring the similarity between multiple sequences. Suppose a user logs in to an account on any platform: the system has to check that her identification information exists before it checks her password. It is well known that password matching involves complicated encryption of plain text in symmetric password systems [15] or modular arithmetic operations in asymmetric password systems [16]. In a more abstract view, however, this is basically string matching. When a system checks the password, it first compares the first letter of the entered password with the first letter of the password stored in the database, followed by the second letter, the third letter, and so forth until all letters are verified. Sequence alignment is a more general and stronger algorithm that measures the similarity between objects. Hence, we consider sequence alignment an appropriate algorithm for this paper.

To the best of our knowledge, no specific algorithm has yet become the common algorithm in keystroke dynamics research. However, Revett [7] shows that the sequence alignment algorithm performs adequately when applied to keystroke dynamics. In addition, some researchers have brought new ideas to keystroke dynamics. Giot and Rosenberger [3] introduce a new soft biometric for keystroke dynamics based on gender recognition. Idrus et al. [4] introduce further valuable information such as the number of hands used, age, and the dominant hand. This extra information can be used as a reference to help improve the performance of algorithms applied to keystroke dynamics. Furthermore, Syed et al. [8] present the concept of event sequences in keystroke dynamics. These event sequences help to distinguish the typing behavior of a user.

The next section describes sequence alignment algorithm. Section 3 discusses the proposed method. Sections 4 and 5 explain the experimental method and the results of the experiment, respectively. The final section presents several concluding remarks and future research issues.

2. Sequence Alignment Algorithm

Sequence alignment is an algorithm that calculates the similarity among two or more sequences [17]. It is widely used in bioinformatics on deoxyribonucleic acid (DNA) sequences, ribonucleic acid (RNA) sequences, and protein sequences. Revett [7] applied the sequence alignment algorithm to keystroke dynamics and obtained encouraging results. In this paper, we compare the performance of our proposed method, Revett’s sequence alignment algorithm, and other previous work.

Keystroke dynamics data is recorded as timestamps (in milliseconds). Since timestamp values are continuous and can in principle be arbitrarily large, it is inappropriate to apply the sequence alignment algorithm to keystroke dynamics directly. Therefore, we have to discretize the timestamps into subintervals, each of which represents a different category. This process is similar to constructing a questionnaire. We usually let a respondent choose among a few options, such as “strongly disagree,” “disagree,” “neither disagree nor agree,” “agree,” and “strongly agree.” Sometimes we shorten the list to three options: “disagree,” “neither disagree nor agree,” and “agree.” Usually we limit the number of options to five or six, because too many categories make the questionnaire hard to analyze later. Revett [7] used twenty categories (i.e., twenty bins) in his research. These twenty categories are taken from the one-letter codes of the amino acids, “ACDEFGHIKLMNPQRSTVWY.” With twenty bins, keystroke dynamics data becomes much more suitable for the sequence alignment algorithm.

We explain the design of the sequence alignment algorithm in the following paragraphs. First, we compute the range of the time values for a feature, for instance, dwell time. This range is the difference between the maximum time and the minimum time. The maximum dwell time is the longest time a user takes to press and release a key. The minimum dwell time, on the other hand, is the shortest time a user takes to press and release a key. The formula is defined by

$$\mathrm{diff}_j = \mathrm{max}_j - \mathrm{min}_j, \tag{1}$$

where $j$ is the number of the column (attribute), and $\mathrm{max}_j$ and $\mathrm{min}_j$ are the maximum and minimum times in that column.

After we obtain the range for an attribute, we divide it into twenty subintervals. The length of each subinterval is defined by

$$\mathrm{rangeDiff}_j = \frac{\mathrm{diff}_j}{20}, \tag{2}$$

where $j$ is the number of the column and $\mathrm{diff}_j$ is the range from (1). We need rangeDiff because it is the length of each category. If a new dwell time is close to the minimum dwell time, it is categorized as letter A. The next category is C, followed by D, and so forth. If a new dwell time is close to the maximum dwell time, it is categorized as Y. Every category has exactly the same length, rangeDiff, because a static divisor is used. Labeling (mapping) a new time can be formulated by

$$\mathrm{label}_{i,j} = \left\lceil \frac{x_{i,j} - \mathrm{min}_j}{\mathrm{rangeDiff}_j} \right\rceil, \tag{3}$$

where $i$ is the number of the row (also known as entry), $j$ is the number of the column, $x_{i,j}$ is the time value in row $i$ and column $j$, and $\mathrm{min}_j$ is the minimum time of column $j$. If a new time of a feature is less than the minimum time of the feature or greater than its maximum time (this mostly happens in the testing phase), it is categorized as a different alphabet letter and is excluded from the alignment. Equations (1), (2), and (3) are used to map the keystrokes to the proper alphabet letters.
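The mapping in (1) to (3) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors’ implementation: the function names (`fit_static_bins`, `label_static`, `to_letter`) and the sample dwell times are hypothetical, the `?` marker for out-of-range values is a placeholder, and the clamp to the top bin is an added guard against floating-point rounding. The sketch also assumes the maximum and minimum of a column differ.

```python
import math

AMINO_LETTERS = "ACDEFGHIKLMNPQRSTVWY"  # the twenty bin letters from the text

def fit_static_bins(column, bins=20):
    """Return (min, max, rangeDiff) for one attribute, following (1) and (2)."""
    lo, hi = min(column), max(column)
    diff = hi - lo                      # equation (1)
    return lo, hi, diff / bins          # equation (2)

def label_static(value, lo, hi, range_diff, bins=20):
    """Return a label in 1..bins following (3), or None when out of range."""
    if value < lo or value > hi:
        return None                     # excluded from the alignment
    label = math.ceil((value - lo) / range_diff)
    return min(max(label, 1), bins)     # a value equal to the minimum goes to bin 1

def to_letter(label):
    """Translate a numeric label to its letter; '?' marks out-of-range values."""
    return "?" if label is None else AMINO_LETTERS[label - 1]

# Hypothetical dwell times (ms) for one key, collected during training.
dwell = [80, 100, 120, 180]
lo, hi, range_diff = fit_static_bins(dwell)
letters = [to_letter(label_static(v, lo, hi, range_diff)) for v in (80, 92, 180, 60)]
print(letters)
```

With these sample values each bin is 5 ms wide, so 92 ms falls into the third bin (letter D) and 60 ms lies below the trained minimum.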

Once a row of data (entry) has been converted to the corresponding alphabet letters, we run the sequence alignment algorithm. One point is scored if the labels match in an attribute; otherwise, no points are scored. The score is described as

$$\mathrm{score}_{i,j} = \begin{cases} 1, & \text{if the label of the new entry in column } j \text{ equals } \mathrm{label}_{i,j}, \\ 0, & \text{otherwise}, \end{cases} \tag{4}$$

where $j$ is the number of the column and $i$ is the row of the model being compared. When an entry is completely verified, we sum all the scores:

$$\mathrm{finalScore}_i = \sum_{j=1}^{n} \mathrm{score}_{i,j}, \tag{5}$$

where $i$ is the number of the row, $j$ is the number of the column, and $n$ is the total number of columns. This final score is then compared with the user-specified threshold. The higher the final score, the more likely the new data is to be recognized as coming from the genuine user. To give a clear description of the whole design, we present the pseudocode in Algorithm 1.

Algorithm 1: Sequence alignment algorithm.
Input: Training data extracted from a genuine user, X. Testing data extracted from a genuine user or an imposter, T.
In the training phase:
(1) for each attribute j do
(2)   max_j ← max(X_j)
(3)   min_j ← min(X_j)
(4) end for
(5) for each attribute j do
(6)   diff_j ← max_j − min_j
(7)   rangeDiff_j ← diff_j/20
(8) end for
(9) for each row i and each attribute j in X do
(10)  label ← ceil((x_ij − min_j)/rangeDiff_j)
(11)  if label = 0
(12)    label ← 1 // If the data has exactly the same value as the minimum time, it is categorized as label 1
(13)  model_ij ← label // The numeric label is kept to represent the corresponding letter
(14) end for
In the testing phase:
(1) Convert T to alphabet letter format based on the minimum point and the range difference from step 3 and step 7,
   respectively, of the training phase, by running steps 9 to 14 of the training phase once on T.
(2) for each row i in model do
(3)   for each attribute j in T do
(4)     if T_j = model_ij
(5)       Score_j ← 1
(6)   end for
(7)   Final_Score_i ← sum(Score)
(8) end for
(9) Checking_Score ← mean(Final_Score) // Can be max, min, median, mean, or mode
(10) return Checking_Score
Output: The score, which is then compared with the threshold

Algorithm 1 consists of a training phase and a testing phase. In the training phase, the algorithm first calculates the maximum and the minimum of each attribute. It then calculates the range difference (rangeDiff) for each attribute. Finally, it calculates the labels to construct the model.

In the testing phase, the algorithm converts the given testing data to alphabet letter format based on the minimum obtained at step 3 and the range difference from step 7 of the training phase. To do so, it executes the conversion loop in steps 9 to 14 once on the test data. This is the conversion process described in step 1 of the testing phase.

After the conversion, for the actual recognition, the algorithm calculates a score (Final_Score) between each row of the model and the test data. This score is the sum of the match scores (Score in Algorithm 1) between the attributes in the model row and the corresponding attributes in the test data. Finally, a statistical summary of the scores, such as the maximum, minimum, median, mean, or mode, is calculated for comparison with the corresponding threshold.
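The scoring part of the testing phase can be sketched as below. This is only an illustration of the matching and summary steps: the function names (`final_scores`, `checking_score`) and the tiny three-attribute model are hypothetical, and the labels are assumed to have been computed already.

```python
from statistics import mean

def final_scores(model, test_labels):
    """Final_Score per model row: number of attributes whose labels match."""
    return [
        sum(1 for m, t in zip(row, test_labels) if t is not None and m == t)
        for row in model
    ]

def checking_score(model, test_labels, summary=mean):
    """Summarize the per-row final scores (mean, max, min, median, ...)."""
    return summary(final_scores(model, test_labels))

# Hypothetical model with three training entries and three attributes.
model = [[1, 3, 5], [1, 4, 5], [2, 3, 5]]
test_entry = [1, 3, 6]

scores = final_scores(model, test_entry)
print(scores)                                   # [2, 1, 1]
print(checking_score(model, test_entry, max))   # 2
```

Passing the summary function as a parameter mirrors the comment at step 9 of the testing phase, where the checking score may be the max, min, median, mean, or mode.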

3. Sequence Alignment with Dynamic Divisor Generation Algorithm

In this paper, we propose the sequence alignment with dynamic divisor generation (SADD) algorithm. SADD checks the degree of sufficiency of the dataset and then provides, for each attribute, a proper divisor in place of the static divisor in (2). In the following paragraphs, we describe how the degree of sufficiency is checked and why we use a dynamic divisor instead of a static one.

As humans, we are not familiar with a new task from the very beginning. We have to practice several times to get accustomed to it. For example, consider an athlete who wants to run the 100 meters in 10 seconds. This is nearly impossible if she is a beginner; she has to train hard and practice regularly. Her time records, from the first day until the day she manages to run 100 meters in 10 seconds, could be plotted as the graph shown in Figure 1. It is worth noting that, after a few months of training, she will find it very difficult to reduce her time any further (i.e., to less than 10 seconds). This is because there is a limit to what we can reach no matter how hard we train and practice. We call this phenomenon the “realm point.” The realm point refers to the temporal or spatial point at which the user has become accustomed to something or has formed a habit. This is also a reason why the world record for the 100 meters currently remains at 9.58 seconds.

The realm point phenomenon discussed above applies to most activities, including the speed of typing a password. Unfortunately, it is difficult to know how many hours, days, weeks, months, or years are needed to reach the realm point of typing speed. Every user takes a different time to reach the realm point even with all other conditions fixed. We do not know whether users have reached the realm point at the beginning of the experiment, and real authentication systems do not know this either. In the best case, the data we collect cover the period from the first day to the realm point (or beyond). In the worst case, the data cover only the period from the first day to some intermediate day, as shown in Figure 2.

With (1) and (2), we have to find the category for each subinterval. Figures 3(a) and 3(b) present the graphs obtained when (1) and (2) are applied to the best case and the worst case. Note that the intervals highlighted inside the red rectangle are the ones mapped to alphabet letters and stored as a model in the database. This model is used to verify new data. In the worst case in particular, the generated categories do not cover all of the information. Hence, once the genuine user gets accustomed to typing the password, the minimum time of her new data falls below the minimum time in the database. In turn, this new data has a high probability of being rejected as an imposter’s because it lies outside the categories. To avoid this problem, we propose SADD in this paper. We use a subset of the alphabet letters (e.g., 15 bins) and apply it to the dataset if its contents are insufficient. The idea is illustrated in Figure 4. We address this problem to ensure that the model stored in the database remains valid and can still be used in a reverse case, for example, when a user is sick and her typing speed becomes slower than usual but her typing behavior is still the same.

Now, we explain the design of the proposed algorithm. First, (1) remains the same. The main procedure of SADD is to find a suitable divisor to use in (2). To find a proper divisor for each attribute, we exploit Horner’s rule [18] to compute a mean. Figure 5 illustrates how we use Horner’s rule to calculate the mean. The grey line labeled 2 (line 2) on the right-hand side is the mean of the first point (the maximum point) and the second point. Line 3 is the mean of line 2 and the third point, and line 4 is the mean of line 3 and the fourth point. The calculation is repeated by summing the previous output and the new data point and then halving the sum, and it terminates when the end of the data is reached. As shown in Figure 5, line 11 is the mean of line 10 and the eleventh point. We can also observe that line 11 is close to the realm point. From line 11, we can conclude that this user has reached the realm point. The mean of each attribute computed with Horner’s rule is defined by

$$\mathrm{mean}_{1,j} = x_{1,j}, \qquad \mathrm{mean}_{i,j} = \frac{\mathrm{mean}_{i-1,j} + x_{i,j}}{2} \quad (i \ge 2), \tag{6}$$

where $i$ is the number of the row, $j$ is the number of the column, and $x_{i,j}$ is the time value in row $i$ and column $j$.
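The repeated halving described above can be sketched in a few lines of Python. The function name `horner_mean` and the sample times are hypothetical and only illustrate the recurrence. Because each step halves the running result, the last sample carries weight 1/2, the one before it 1/4, and so on, so the result leans toward the most recent entries rather than toward the arithmetic mean.

```python
def horner_mean(values):
    """Repeatedly average the running result with the next value, as in (6)."""
    it = iter(values)
    mean = next(it)              # start from the first point
    for x in it:
        mean = (mean + x) / 2    # sum the previous output and the new point, then halve
    return mean

# Hypothetical timings (ms) that decrease as the user gets used to the password.
times = [200, 160, 140, 130, 128]
recent_weighted = horner_mean(times)
arithmetic = sum(times) / len(times)
print(recent_weighted, arithmetic)
```

For these sample values the running result is 136.5 ms, noticeably closer to the final (smallest) timings than the arithmetic mean of 151.6 ms.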

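From line 11 in Figure 5, we observe that the calculated mean is near the minimum point (i.e., the realm point) and far from the maximum point (i.e., the beginning point). To get a proper divisor, we compute the ratio of the distance between the mean and the minimum point to the distance between the maximum point and the minimum point. The formula is

$$\mathrm{ratio}_j = \frac{\mathrm{mean}_j - \mathrm{min}_j}{\mathrm{diff}_j}, \tag{7}$$

where $j$ is the number of the column, $\mathrm{mean}_j$ is the final mean from (6), and $0 \le \mathrm{ratio}_j \le 1$.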

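Once we obtain the ratio, we calculate the divisor by

$$\mathrm{divisor}_j = b - (b \times \mathrm{ratio}_j), \tag{8}$$

where $j$ is the number of the column, $b$ is 20 for twenty bins, and $0 \le \mathrm{divisor}_j \le b$.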

After that, we replace the value twenty in (2) with this new divisor:

$$\mathrm{rangeDiff}_j = \frac{\mathrm{diff}_j}{\mathrm{divisor}_j}, \tag{9}$$

where $j$ is the number of the column.

Consider the worst case shown in Figure 3(b). Note that the first letter should not be “A” but a certain letter in the middle of the alphabet. Therefore, we modify (3) as follows:

$$\mathrm{label}_{i,j} = \left\lceil \frac{x_{i,j} - \mathrm{min}_j}{\mathrm{rangeDiff}_j} \right\rceil + (b - \mathrm{divisor}_j), \tag{10}$$

where $i$ is the number of the row, $j$ is the number of the column, and $b$ is 20 for the twenty bins. This means that new data closer to the maximum point is classified as “Y,” followed by “W,” “V,” “T,” and so forth. The calculation of the score and the final score remains the same as in (4) and (5). To explain our proposed algorithm clearly, we provide Algorithm 2.

Algorithm 2: Sequence alignment with dynamic divisor generation (SADD).
Input: Training data extracted from a genuine user, X. Testing data extracted from a genuine user or an imposter, T.
In the training phase:
(1) for each attribute j do
(2)   max_j ← max(X_j)
(3)   min_j ← min(X_j)
(4) end for
(5) for each attribute j do
(6)   mean ← x_1j
(7)   for each row i do // Start from the second row
(8)     mean ← (mean + x_ij)/2
(9)   end for
(10)  diff ← max_j − min_j
(11)  ratio ← (mean − min_j)/diff
(12)  divisor_j ← 20 − (20 * ratio)
(13)  rangeDiff_j ← diff/divisor_j
(14) end for
(15) for each row i and each attribute j in X do
(16)  label ← ceil((x_ij − min_j)/rangeDiff_j) + (20 − divisor_j)
(17)  if label = 0
(18)    label ← 1 // If the data has exactly the same value as the minimum time, it is categorized as label 1
(19)  model_ij ← label // The numeric label is kept to represent the corresponding letter
(20) end for
In the testing phase:
(1) Convert T to the alphabet letter format based on the minimum point and the range difference from step 3 and step 13,
   respectively, of the training phase, by running steps 15 to 20 of the training phase once on T.
(2) for each row i in model do
(3)   for each attribute j in T do
(4)     if T_j = model_ij
(5)       Score_j ← 1
(6)   end for
(7)   Final_Score_i ← sum(Score)
(8) end for
(9) Checking_Score ← mean(Final_Score) // Can be max, min, median, mean, or mode
(10) return Checking_Score
Output: The score, which is then compared with the threshold

In the training phase of Algorithm 2, we calculate the mean by Horner’s rule in steps 5 to 9. We calculate the ratio of (7) at step 11, the divisor of (8) at step 12, and the range difference of (9) at step 13. The label assignment described in (10) is implemented at step 16 of Algorithm 2.
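The training steps for a single attribute can be sketched in Python as follows. This is a hedged illustration rather than the authors’ code: the names `sadd_fit` and `sadd_label` and the sample timings are hypothetical. One assumption goes beyond the text: the divisor is rounded to a whole number of bins (and kept at 1 or more) so that the labels stay integers, which the paper does not specify.

```python
import math

def sadd_fit(column, bins=20):
    """Return (min, max, divisor, rangeDiff) for one attribute."""
    lo, hi = min(column), max(column)
    mean = column[0]
    for x in column[1:]:                    # start from the second row
        mean = (mean + x) / 2               # equation (6)
    diff = hi - lo                          # equation (1)
    ratio = (mean - lo) / diff              # equation (7)
    # Assumption (not in the text): round to a whole number of bins, at least 1.
    divisor = max(1, round(bins - bins * ratio))    # equation (8)
    range_diff = diff / divisor             # equation (9)
    return lo, hi, divisor, range_diff

def sadd_label(value, lo, hi, divisor, range_diff, bins=20):
    """Return the shifted label of (10), or None when out of range."""
    if value < lo or value > hi:
        return None
    label = math.ceil((value - lo) / range_diff) + (bins - divisor)
    if label == 0:                          # same guard as in the pseudocode
        label = 1
    return min(label, bins)                 # guard against floating-point rounding

# Hypothetical timings (ms) for one attribute.
times = [200, 160, 140, 130, 128]
lo, hi, divisor, range_diff = sadd_fit(times)
labels = [sadd_label(v, lo, hi, divisor, range_diff) for v in (200, 130, 128, 120)]
print(divisor, range_diff, labels)
```

With these sample values the divisor becomes 18, so the bins are 4 ms wide and the labels are shifted up by 2: the maximum still maps to label 20, while values near the minimum start at the lower end of the reduced letter range.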

4. Experimental Method

4.1. Training and Testing Phase

In keystroke dynamics, there are six common timing elements that we can extract from two consecutive keystrokes. They are as follows.
(i) Hold (H): the duration (or dwell time) of pressing a key:
   (a) the holding time of the first key, H1;
   (b) the holding time of the second key, H2.
(ii) Up-down (UD): the duration (or flight time) between key-up of the first key and key-down of the second key.
(iii) Down-down (DD): the duration between key-down of the first key and key-down of the second key; it is the sum of H1 and UD.
(iv) Up-up (UU): the duration between key-up of the first key and key-up of the second key; it is the sum of UD and H2.
(v) Down-up (DU): the duration between key-down of the first key and key-up of the second key; it is the sum of DD and H2.
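The six elements can be derived from the press and release timestamps of two consecutive keys, as in the following sketch. The `KeyEvent` class, the `timing_features` function, and the timestamps are hypothetical and serve only to make the definitions concrete.

```python
from dataclasses import dataclass

@dataclass
class KeyEvent:
    key: str
    down: int   # key-down timestamp in milliseconds
    up: int     # key-up timestamp in milliseconds

def timing_features(first, second):
    """Compute the six timing elements for two consecutive keystrokes."""
    return {
        "H1": first.up - first.down,      # hold time of the first key
        "H2": second.up - second.down,    # hold time of the second key
        "UD": second.down - first.up,     # key-up of first to key-down of second
        "DD": second.down - first.down,   # key-down to key-down
        "UU": second.up - first.up,       # key-up to key-up
        "DU": second.up - first.down,     # key-down of first to key-up of second
    }

# Hypothetical timestamps for the keys "t" and "i".
f = timing_features(KeyEvent("t", 0, 95), KeyEvent("i", 150, 230))
print(f)
```

The derived elements satisfy the sums listed above: DD = H1 + UD, UU = UD + H2, and DU = DD + H2.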

In our experiment, we use the CMU benchmark dataset [5]. This dataset consists of four elements: H1, H2, UD, and DD. However, we suspect that the DD element is not appropriate for sequence alignment or similar algorithms, because the sequence alignment algorithm compares attribute by attribute, that is, element by element. Since DD is the sum of H1 and UD, the same DD value can be obtained from different values of H1 and UD. For instance, consider the example with two different instances shown in Figure 7.

Assume that Instance number 1 is data collected from a genuine user and used as the model, and Instance number 2 is new data entered by an imposter. Since this is a short example, we omit the procedure for converting the timestamp format into the consensus sequence (label format). As explained above, the sequence alignment algorithm checks element by element. The first element it checks is H1. Based on (4), since the H1 of Instance number 2 does not match the H1 of Instance number 1, a score of zero is given. Next, the algorithm checks the UD element. It does not match either, so a score of zero is given. Finally, the algorithm checks the last element, DD. Since the two values match, a score of one is given. If the threshold set by the system is one, then Instance number 2 is predicted to be from the genuine user; otherwise, it is rejected as an imposter’s. If the system accepts Instance number 2 as genuine, the genuine user may suffer a serious loss. Moreover, this is only a short example; in real life, there can be a considerable number of DD elements. If an imposter is fortunate enough that all of the DD elements match some entries of the genuine user in the stored model (because DD is the sum of H1 and UD), and the threshold is the total number of DD elements, then the system accepts this instance as genuine and the imposter can access the system.
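The effect can be reproduced with a small sketch. The timing values below are hypothetical (they are not the values of Figure 7), and, as in the example above, raw values are compared directly instead of converted labels.

```python
def match_scores(model, test, elements):
    """Score 1 for each element whose value matches, 0 otherwise, as in (4)."""
    return {e: int(model[e] == test[e]) for e in elements}

# Hypothetical instances: different H1 and UD, but the same sum.
genuine = {"H1": 100, "UD": 50}
imposter = {"H1": 60, "UD": 90}
for instance in (genuine, imposter):
    instance["DD"] = instance["H1"] + instance["UD"]   # DD is the sum of H1 and UD

with_dd = match_scores(genuine, imposter, ["H1", "UD", "DD"])
without_dd = match_scores(genuine, imposter, ["H1", "UD"])
print(sum(with_dd.values()), sum(without_dd.values()))
```

With the DD element included, the imposter's final score is 1 even though neither H1 nor UD matches; without DD, the final score is 0.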

Therefore, to examine the effectiveness of DD elements in authentication, we create another dataset from the benchmark dataset that has no DD elements (by removing all DD attributes from the original dataset). We also want to evaluate how effective it is to use all elements in authentication. Hence, we create a further dataset from the benchmark dataset that has extra UU and DU elements (by adding UU and DU attributes to the original dataset). Basically, the difference among these three datasets is the number of attributes used in the experiment. The first dataset (Dataset number 1) consists of 31 attributes, the second dataset (Dataset number 2) consists of 21 attributes, and the last dataset (Dataset number 3) has 51 attributes.

The benchmark dataset and the two extra datasets we have created each contain 20,400 password-typing entries from 51 subjects, with 400 entries per subject. In each run of our experiment, we select one subject as the genuine user and the remaining 50 subjects as imposters. The first 200 entries of the chosen subject are used as training data, and the remaining 200 entries are used as testing data. In addition, we take the first five entries from each of the remaining 50 subjects as testing data. In total, 450 entries are used in the testing phase. Following Killourhy and Maxion [5], the first five entries of each of the remaining 50 subjects are used because the imposter is assumed to be unfamiliar with the password. We perform at least 51 runs on each dataset, and thus the total number of runs for the three datasets is 153.
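The split for one run can be sketched as follows. The function name `make_split` and the placeholder data are hypothetical; the sketch assumes the entries of each subject are stored in typing order and does not read the actual CMU files.

```python
def make_split(data, genuine, n_train=200, n_imposter=5):
    """data maps a subject id to that subject's entries in typing order."""
    own = data[genuine]
    train = own[:n_train]                 # first 200 entries of the genuine user
    test_genuine = own[n_train:]          # remaining 200 entries
    test_imposter = [
        entry
        for subject, entries in data.items()
        if subject != genuine
        for entry in entries[:n_imposter]  # first five entries of every other subject
    ]
    return train, test_genuine, test_imposter

# Placeholder data with the dataset's shape: 51 subjects, 400 entries each.
data = {s: [(s, k) for k in range(400)] for s in range(51)}
train, test_genuine, test_imposter = make_split(data, genuine=0)
print(len(train), len(test_genuine), len(test_imposter))
```

Repeating the call with each of the 51 subjects as the genuine user gives the 51 runs per dataset mentioned above.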

4.2. Performance Evaluation Using ROC Curves

The main performance measure in our experiment is the receiver operating characteristic (ROC) curve. We compare our algorithm with other approaches by plotting their ROC curves. An example of an ROC curve is shown in Figure 6. The y-axis is the true positive rate, which is the rate at which an imposter is detected. The x-axis is the false positive rate, which is the rate at which a genuine user is detected as an imposter. We can calculate the equal error rate (EER) from the ROC curve: the EER is read off at the point where the curve intersects the diagonal line. We can conclude that an algorithm performs better than others if its curve is closer to the 1.0 point on the y-axis or its area under the curve (AUC) is larger than that of the other algorithms.
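A simple way to approximate the EER from checking scores is sketched below. It assumes that a higher score means a more genuine-looking entry and that an entry is accepted when its score reaches the threshold. The function names and the sample scores are hypothetical, and the EER is taken at the threshold where the false positive rate and the miss rate (1 minus the true positive rate) are closest, which corresponds to the diagonal from (0, 1) to (1, 0) in ROC space.

```python
def error_rates(genuine, imposter, threshold):
    """Accept an entry as genuine when its score is >= threshold."""
    fpr = sum(s < threshold for s in genuine) / len(genuine)       # genuine flagged as imposter
    miss = sum(s >= threshold for s in imposter) / len(imposter)   # imposter not detected
    return fpr, miss

def equal_error_rate(genuine, imposter):
    """Approximate the EER by scanning every observed score as a threshold."""
    scores = sorted(set(genuine) | set(imposter))
    thresholds = scores + [scores[-1] + 1]      # also try rejecting everything
    fpr, miss = min(
        (error_rates(genuine, imposter, t) for t in thresholds),
        key=lambda rates: abs(rates[0] - rates[1]),
    )
    return (fpr + miss) / 2

# Hypothetical checking scores.
genuine_scores = [9, 8, 8, 7, 5]
imposter_scores = [7, 6, 5, 4, 3]
eer = equal_error_rate(genuine_scores, imposter_scores)
print(eer)
```

For these sample scores the two error rates meet at a threshold of 7, where one of five genuine entries is rejected and one of five imposter entries is accepted.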

5. Experimental Results

As mentioned above, we run three different datasets in our experiment. The first dataset is the CMU benchmark dataset, the second dataset is without the DD element, and the last dataset has the extra UU and DU elements. Table 1 shows the performance of our algorithm (SADD), sequence alignment, median vector proximity [6], Mahalanobis distance, Manhattan distance, and Euclidean distance. Table 1 shows that our proposed algorithm achieves higher performance than previous work on Dataset number 1 and Dataset number 2. Although its result for Dataset number 3 is not the best among the six algorithms, SADD differs by only 0.001 in EER from the median vector proximity algorithm. Based on the results, our algorithm is comparable to or outperforms the others in every case, with or without some of the elements (i.e., UU, DD, UD, and DU elements).

To observe the performance of each algorithm for each subject, we report the area under the curve (AUC) in Tables 2, 3, and 4 for Dataset number 1, Dataset number 2, and Dataset number 3, respectively. An AUC close to 1 means that the classification by that particular algorithm is desirable. These results are consistent with those in Table 1. Tables 2 and 3 show that SADD has the highest AUC, which is why SADD shows the highest performance on Dataset number 1 and Dataset number 2 in Table 1. However, Table 4 shows that median vector proximity has the highest AUC, which is why it has the highest performance on Dataset number 3 in Table 1. From Table 4, it is worth noting that although median vector proximity shows higher overall accuracy, SADD has more wins (20) than median vector proximity (19) on Dataset number 3.

At step 9 of the testing phase in Algorithms 1 and 2, we note that statistical summary information such as the minimum, maximum, mean, median, or mode can be used to calculate the checking score. To show the effect of the statistical metric, we report the results with different statistical metrics as the average EER with its standard deviation in Table 5. Interestingly, the mean yields higher performance than other statistical metrics such as the median or mode. Furthermore, our proposed algorithm shows the highest performance on the three datasets.

In addition, we conduct an experiment with different numbers of bins (categories) to test the effect of the number of bins on our proposed algorithm and on the sequence alignment (SA) algorithm with static divisor. The results are shown in Table 6. The statistical metric used in this experiment is the mean. Table 6 indicates that 20 bins and 30 bins give almost the same performance. However, depending on the dataset, a different number of bins is needed to obtain optimal performance. As shown in Table 6, SADD obtains its best results with 20 bins on Dataset number 1 and Dataset number 2 but with 30 bins on Dataset number 3. Meanwhile, the SA algorithm is better suited to 30 bins on Dataset number 1 and Dataset number 3 and to 20 bins on Dataset number 2.

Revett [7] shows that the sequence alignment algorithm performs effectively when applied to keystroke dynamics, and he also provides the steps for applying it. However, there is still much room for improvement, as described in Section 3. Therefore, we have proposed the sequence alignment with dynamic divisor generation (SADD) algorithm, which checks the degree of sufficiency of the dataset and provides a proper divisor for each attribute. Our results show that SADD achieves higher performance than the sequence alignment algorithm that uses a static divisor for every attribute.

Al-Jarrah [6] proposed the median vector proximity algorithm. He achieves very good performance by using the median instead of the mean in his experiment. In our proposed algorithm, we apply various statistical metrics in the experiments to find the most appropriate measure.

Giot and Rosenberger [3] introduce a new soft biometric for keystroke dynamics based on gender recognition. A similar study by Idrus et al. [4] introduces further valuable information such as the number of hands used (i.e., one hand or both hands), age, and the dominant hand (i.e., left or right). This extra information can be used as a reference to help improve the performance of algorithms. Although we do not include it in our experiment because of the limited information available in the CMU benchmark dataset, we believe that it will be beneficial in future work.

Syed et al. [8] present the concept of event sequences in keystroke dynamics. These event sequences help in distinguishing the typing behavior of a user. Most keystroke dynamics methods use key-down and key-up events without the actual key values. However, introducing too many dimensions can cause the curse of dimensionality. Extending our algorithm to accommodate these event sequences is therefore an interesting direction for future work.

7. Conclusion

In this paper, we have proposed sequence alignment with dynamic divisor generation (SADD) for user authentication using keystroke dynamics. In the experiments we have conducted, our algorithm produces promising results and mostly outperforms previous work. We also show empirically that the dynamic divisor generally outperforms the static divisor. We believe that the dynamic divisor plays an important role in the sequence alignment algorithm because it estimates the degree of sufficiency of the dataset (using the mean computed with Horner’s rule) and then yields an appropriate divisor for each attribute. These dynamic divisors help to prevent the genuine user’s data from falling outside the valid categories.

Giot and Rosenberger [3] introduce a new soft biometric for keystroke dynamics based on gender recognition, which is interesting new information for keystroke dynamics. Furthermore, Idrus et al. [4] introduce further valuable information such as the number of hands used (i.e., one hand or both hands), age, and the dominant hand (i.e., left or right). However, they only study the effectiveness of using this information. Therefore, it will be interesting to apply this extra information in our future study. We would also like to discover more valuable information beyond what we have discussed above.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MEST) (no. NRF-2013R1A1A2013401).