Abstract

CVSS is a specification for measuring the relative severity of software vulnerabilities. The empirical values that CVSS-SIG assigns to the CVSS metrics cannot describe the causes of software vulnerabilities, so CVSS fails to distinguish between vulnerabilities that have the same score but different levels of severity. In this paper, a software vulnerability rating approach (SVRA) is proposed. SVRA uses a vulnerability database to analyze the frequencies of the CVSS metrics at different times. The equations for both the exploitability and impact subscores are then given in terms of these frequencies. SVRA performs a weighted average of these two subscores to create an SVRA score, so the score of a vulnerability is calculated dynamically at different times from the vulnerability database. Experiments were performed to validate the efficiency of the SVRA.

1. Introduction

The Common Vulnerability Scoring System (CVSS), developed and maintained by the CVSS Special Interest Group (CVSS-SIG) working under the auspices of the Forum of Incident Response and Security Teams (FIRST), can be applied to the classification of security vulnerabilities [1] and the analysis of attack models [2]. CVSS has been adopted by many software vendors and service providers [3]. The US federal government uses it for its National Vulnerability Database [4] and mandates its use in products validated by the Security Content Automation Protocol (SCAP) program.

Many proprietary schemes exist for rating software flaw vulnerabilities, most of them created by software vendors, but CVSS is the only known open specification. In contrast to other scoring systems, CVSS was designed to be quantitative so that analysts would not have to perform qualitative evaluations of vulnerability severity. Great effort has been directed at developing the CVSS specification so that any two vulnerability analysts obtain identical CVSS scores for the same vulnerability. The scores are computed from a series of measurements (called metrics) that rest on expert assessment.

1.1. Overview of CVSS Framework

CVSS provides an open framework for describing the characteristics and impacts of IT vulnerabilities. It contains three groups of metrics (see Figure 1), as explained in [5, 6].

(1) Base. It represents the intrinsic and fundamental characteristics of a vulnerability that are constant over time and across user environments. An equation is applied to the values of the base metrics to compute a vulnerability's base score.

(2) Temporal. It represents the characteristics of a vulnerability that change over time but apply to all instances of the vulnerability in all environments, such as the public availability of exploit code or of a remediation technique. A temporal score for a vulnerability is calculated with an equation that uses both the base score and the temporal metric values as parameters.

(3) Environmental. It captures the characteristics of a vulnerability that are associated with a particular user's IT environment. Since environmental metrics are optional, each of them includes a metric value that has no effect on the score. An environmental score is calculated with an equation that uses both the temporal score and the environmental metric values as parameters.

The initial CVSS specification was developed by the National Infrastructure Advisory Council and published in October 2004 [7]. During the analysis and use of the original CVSS version, many deficiencies were found, as explained in [8]. Finalized in 2007, the current version (CVSS v2) was designed to address these deficiencies. The base metric group of CVSS v2 has two subscores.

(1) Exploitability subscore $S_E$, composed of the access vector ($AV$), access complexity ($AC$), and authentication instances ($AU$), is computed by the following equation:

$$S_E = 20 \times AV \times AC \times AU. \tag{1}$$

(2) Impact subscore $S_I$, which expresses the potential damage on confidentiality ($C$), integrity ($I$), and availability ($A$), is computed as follows:

$$S_I = 10.41 \times \bigl(1 - (1 - C)(1 - I)(1 - A)\bigr). \tag{2}$$

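Table 1 gives all possible values of the six base metrics in v2, which are used to calculate these two subscores. The overall base score of v2, $BS$, is expressed in terms of the impact ($S_I$) and exploitability ($S_E$) components by

$$BS = (0.6\,S_I + 0.4\,S_E - 1.5) \times f(S_I), \qquad f(S_I) = \begin{cases} 0, & S_I = 0, \\ 1.176, & \text{otherwise}. \end{cases} \tag{3}$$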

The base score is rounded to one decimal place and ranges from 0.0 to 10.0. More details related to CVSS metrics and their scoring computation can be found in the CVSS guide [5].
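As a minimal sketch of the v2 base score computation described above, the following Python code implements the exploitability subscore, the impact subscore, and the base score. The numeric weights are the ones published in the public CVSS v2 guide and are assumed to match Table 1; the function names and the single-letter value codes are illustrative choices, not part of the paper.

```python
# Metric weights from the public CVSS v2 guide (assumed to match Table 1).
ACCESS_VECTOR = {"L": 0.395, "A": 0.646, "N": 1.0}      # Local, Adjacent, Network
ACCESS_COMPLEXITY = {"H": 0.35, "M": 0.61, "L": 0.71}   # High, Medium, Low
AUTHENTICATION = {"M": 0.45, "S": 0.56, "N": 0.704}     # Multiple, Single, None
IMPACT = {"N": 0.0, "P": 0.275, "C": 0.660}             # None, Partial, Complete


def exploitability(av, ac, au):
    """Exploitability subscore: 20 * AV * AC * AU."""
    return 20 * ACCESS_VECTOR[av] * ACCESS_COMPLEXITY[ac] * AUTHENTICATION[au]


def impact(c, i, a):
    """Impact subscore: 10.41 * (1 - (1 - C)(1 - I)(1 - A))."""
    return 10.41 * (1 - (1 - IMPACT[c]) * (1 - IMPACT[i]) * (1 - IMPACT[a]))


def base_score(av, ac, au, c, i, a):
    """v2 base score, rounded to one decimal place."""
    imp = impact(c, i, a)
    f = 0.0 if imp == 0 else 1.176  # a vulnerability with no impact scores 0
    return round((0.6 * imp + 0.4 * exploitability(av, ac, au) - 1.5) * f, 1)


if __name__ == "__main__":
    # Remote, low complexity, no authentication, partial confidentiality loss.
    print(base_score("N", "L", "N", "P", "N", "N"))
    # Same exploitability, complete loss of confidentiality, integrity, availability.
    print(base_score("N", "L", "N", "C", "C", "C"))
```

Because the weights are fixed constants, two vulnerabilities with the same six metric values always receive the same base score, regardless of when they are scored.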

1.2. Shortcomings of CVSS v2

We downloaded 54,432 vulnerabilities listed in the Common Vulnerabilities and Exposures (CVE) dictionary [9]; these encompass all valid CVE entries published between 2002 and 2012. The scoring was performed by the National Vulnerability Database (NVD) [4] in accordance with the v2 specification.

When separate vulnerabilities are scored, each should be scored completely independently of the others, without taking any interaction into account. According to v2 scoring tip number 1:

“Vulnerability scoring should not take into account any interaction with other vulnerabilities. That is, each vulnerability should be scored independently.”

The statistical package SPSS was used to perform a Chi-square analysis, and the results indicate that correlations exist among the six metrics. For example, Table 2 shows the frequency data for one pair of metrics. These data were used as inputs to SPSS, and the results of the Chi-square analysis are given in Table 3. Asymp. Sig. is smaller than the significance level 0.05, indicating that the correlation between the two metrics is significant. Similarly, significant correlations exist between other pairs of metrics.
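The kind of test applied here can be sketched in a few lines of Python. The code below computes the Pearson Chi-square statistic for a contingency table of two metrics and compares it with the critical value for 4 degrees of freedom at the 0.05 level (9.488). The counts are invented for illustration and are not the data of Table 2; the paper itself used SPSS, which reports a p-value (Asymp. Sig.) rather than a critical-value comparison.

```python
def chi_square_statistic(table):
    """Return (chi2, degrees_of_freedom) for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for r, row in enumerate(table):
        for c, observed in enumerate(row):
            # Expected count under the hypothesis that the two metrics are independent.
            expected = row_totals[r] * col_totals[c] / total
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


# Hypothetical 3x3 counts: rows are the values of one metric, columns the values of another.
counts = [
    [120, 30, 10],
    [40, 200, 60],
    [15, 45, 180],
]

CRITICAL_VALUE_DF4_ALPHA_005 = 9.488  # standard Chi-square table value

if __name__ == "__main__":
    chi2, dof = chi_square_statistic(counts)
    print(f"chi2 = {chi2:.2f}, dof = {dof}")
    print("dependent at the 0.05 level:", chi2 > CRITICAL_VALUE_DF4_ALPHA_005)
```

A statistic above the critical value leads to rejecting independence, which corresponds to an Asymp. Sig. below 0.05 in the SPSS output.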

Most importantly, CVSS v2 fails to distinguish different vulnerabilities. As an example, a path disclosure flaw in a web application would receive a total score of 5.0. A vulnerability that allows an attacker to traverse the file system and read any file accessible by the web server would receive the same score as the path disclosure flaw. These two flaws obviously pose significantly different risks, yet according to CVSS v2 standards, they are identical.

Once the v2 metrics were defined, opinions related to the scoring for each type of vulnerability were collected from the CVSS-SIG members and their organizations. Each of the six metrics had three possible values, resulting in $3^6 = 729$ possible vulnerability types. It was not possible to create scores for these 729 types in the range $[0, 10]$ in a justifiable manner. So, the researchers divided the base metrics into two subgroups: impact and exploitability. Each group had three metrics with three possible values, so only $3^3 = 27$ vulnerability types per group had to be scored and ranked. The researchers reached consensus on the approximate rankings and scores, leading to the creation of lookup tables for impact and exploitability. The CVSS score was computed by a weighted average of exploitability and impact. However, the CVSS community desired an equation instead of lookup tables. So, mathematicians proposed equations (1)–(3) to approximate the lookup tables. In essence, these equations were derived from the designers' experience and statistical results of vulnerability data. As time went on, it became clear that these equations, as well as the empirical values listed in Table 1, might no longer be applicable.
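The counting argument above can be checked by enumerating the metric combinations directly. This is a small illustrative sketch; the single-letter value codes follow common CVSS v2 vector notation and are not taken from the paper.

```python
from itertools import product

# Three possible values per metric (codes follow common CVSS v2 vector notation).
EXPLOITABILITY_VALUES = {"AV": "LAN", "AC": "HML", "AU": "MSN"}
IMPACT_VALUES = {"C": "NPC", "I": "NPC", "A": "NPC"}

# All combinations within each subgroup: 3 * 3 * 3 = 27 types.
exploitability_types = list(product(*EXPLOITABILITY_VALUES.values()))
impact_types = list(product(*IMPACT_VALUES.values()))

# All combinations of the six base metrics: 27 * 27 = 729 types.
all_types = list(product(exploitability_types, impact_types))

if __name__ == "__main__":
    print(len(exploitability_types), len(impact_types), len(all_types))
```

Splitting the metrics into two subgroups reduces the ranking task from one table of 729 entries to two tables of 27 entries each.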

1.3. Our Design Methodology

To overcome the shortcomings mentioned above, a software vulnerability rating approach (SVRA) is proposed. SVRA takes time as an important parameter. Based on a vulnerability database, it counts the frequencies of the six metrics at any given time point. Then, the three values of each metric are given by their frequencies. As the frequencies change over time, each metric takes different values instead of a constant value.

The process of exploiting a vulnerability is a step-by-step procedure, but the impact is an evolutionary and accumulative process. So, the frequency of the exploitability vector (access vector, access complexity, and authentication taken together) is used to approximate the exploitability, while the individual frequencies of the confidentiality, integrity, and availability impact values are utilized to calculate the impact. To create an SVRA score from these two subscores, SVRA also performs a weighted average of exploitability and impact, with exploitability having a weight of 0.4 and impact having a weight of 0.6. In terms of design methodology, SVRA is fundamentally different from CVSS v1 and CVSS v2. The score of a vulnerability in SVRA changes dynamically over time, which is not true in v2 or v1.
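The weighting step alone can be sketched as follows. This shows only the 0.4/0.6 weighted average stated above; the complete SVRA base score equation of Section 2.4 may contain further terms, so the function below is an assumption-laden illustration rather than the authors' formula, and its name and parameters are hypothetical.

```python
def weighted_score(exploitability, impact, w_exploitability=0.4, w_impact=0.6):
    """Weighted average of two subscores that each lie in the range [0, 10].

    Hypothetical helper: it illustrates only the weighting, not the full
    SVRA base score equation.
    """
    return w_exploitability * exploitability + w_impact * impact


if __name__ == "__main__":
    # Impact carries more weight, so swapping the two subscores changes the result.
    print(weighted_score(exploitability=5.0, impact=8.0))
    print(weighted_score(exploitability=8.0, impact=5.0))
```

With these weights, a vulnerability that is hard to exploit but highly damaging scores higher than one that is easy to exploit but mildly damaging.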

The rest of this paper is organized as follows. The next section provides the framework of the SVRA. Section 3 describes the analysis of and comparison between CVSS and SVRA, and the experimental results are also reported. Section 4 summarizes our conclusions and highlights some suggestions for future work.

2. Software Vulnerability Rating Approach

This section provides the framework of our software vulnerability rating approach (SVRA), where the base score of a vulnerability is dynamically calculated over time.

2.1. Frequency

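Let $T$ be the time domain. Obviously, all vulnerabilities have their own report times. Given $t \in T$, let $D_t$, a vulnerability database, be the set of all vulnerabilities whose report time is less than or equal to $t$, and let $M$ be the set of metric values, which contains 18 elements (three values for each of the six base metrics). Then, for $m \in M$, a subset $D_t(m) \subseteq D_t$ is defined as

$$D_t(m) = \{\, v \in D_t : v \text{ takes the metric value } m \,\}.$$

The frequency of $m$ at $t$ is denoted as $f_t(m)$, and it can be computed as follows:

$$f_t(m) = \frac{|D_t(m)|}{|D_t|},$$

where $|S|$ denotes the cardinality of a set $S$.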

Note that the report time of a vulnerability is unique, and it is the cut-off point that determines whether the vulnerability belongs to the database at a given time. So, the report time is chosen as the benchmark for ordering the vulnerability database. Other time parameters, such as the modified date, cannot order the database effectively (for example, a vulnerability may not have a modified date at all).

Figure 2 shows the frequency curves of the six basic metrics at time point for different values. For the exploitability metrics, the three curves of , , and have a higher position in their coordinate systems. This shows that a vulnerability falling into the group is more vulnerable. Overall, the curves of exploitability metrics are divergent, while the curves of impact metrics are convergent.

Similarly, for , and , the frequency of at time can be calculated as follows: where . If , .
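The frequency definitions of this subsection can be sketched as follows, under the assumption that a frequency is the fraction of vulnerabilities reported up to the given time that take a given metric value (or a given combination of values), and that the frequency is 0 when no vulnerability has been reported yet. The record layout, the function names, and the four sample records are hypothetical and serve only as illustration.

```python
from datetime import date

# Hypothetical records: report date plus the six base metric values.
DATABASE = [
    {"reported": date(2003, 5, 1), "AV": "N", "AC": "L", "AU": "N", "C": "P", "I": "N", "A": "N"},
    {"reported": date(2005, 2, 10), "AV": "N", "AC": "M", "AU": "N", "C": "P", "I": "P", "A": "P"},
    {"reported": date(2008, 9, 30), "AV": "L", "AC": "L", "AU": "N", "C": "C", "I": "C", "A": "C"},
    {"reported": date(2011, 1, 15), "AV": "N", "AC": "L", "AU": "N", "C": "N", "I": "P", "A": "N"},
]


def snapshot(database, t):
    """All vulnerabilities whose report time is less than or equal to t."""
    return [v for v in database if v["reported"] <= t]


def metric_frequency(database, t, metric, value):
    """Fraction of the snapshot at time t whose `metric` equals `value`."""
    d_t = snapshot(database, t)
    if not d_t:
        return 0.0  # assumption: empty database gives frequency 0
    return sum(1 for v in d_t if v[metric] == value) / len(d_t)


def vector_frequency(database, t, assignment):
    """Fraction of the snapshot at time t matching every metric in `assignment`."""
    d_t = snapshot(database, t)
    if not d_t:
        return 0.0  # assumption: empty database gives frequency 0
    matches = sum(1 for v in d_t if all(v[m] == val for m, val in assignment.items()))
    return matches / len(d_t)


if __name__ == "__main__":
    for t in (date(2006, 1, 1), date(2012, 12, 31)):
        print(t, metric_frequency(DATABASE, t, "AV", "N"),
              vector_frequency(DATABASE, t, {"AV": "N", "AC": "L", "AU": "N"}))
```

Evaluating the same functions at a later time point uses a larger snapshot, which is why the frequencies, and any score built on them, can change over time.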

2.2. Exploitability Score

Consider the correlations among , , and ; the equation for exploitability subscore is defined by the following probability: where is the probability measure. In contrast to v2, we use probability to define the exploitability score instead of the magic numbers in Table 1. The probability also includes the time point as its parameter, and it can be given by the root of solution of its frequency:

Because the frequencies of the 27 vulnerability types of exploitability metrics are not of the same order of magnitude, is used to approximate the probability instead of . So, there must exist the smallest positive integer such that

The basic idea of this equation is that the minimum and nonzero value is close to 1/27 when the range is divided into 27 classes. Since these subscores are normalized to the range , the coefficient of (8) can be determined by

For the database , the number of each exploitability type is listed in fourth column of Table 4. As can be seen, the value and , so, . Note that , and

From (8), the exploitability subscores of SVRA can be calculated; the results are listed in the fifth column of Table 4. CVSS v2 has 9 exploitability vectors with a score of 0, while SVRA has only one. The theoretical distributions of exploitability subscores for both SVRA and CVSS v2 are shown in Figure 3. From a theoretical viewpoint, v2 subscores have much less diversity than SVRA subscores. Figure 4 shows how exploitability subscores change over time for two vulnerability types: and . For SVRA, the exploitability subscore may increase, decrease, or remain unchanged. Table 5 lists the and values from 2002 to 2012. When the amount of vulnerability data increases, the changes for and are minor. This indicates that (8) has better stability and can dynamically compute when changes over time.

2.3. Impact Score

The impact caused by a vulnerability varies. Ideally, the sum of all categories of impact can be used to measure the impact subscore . However, there exist correlations among the impact metrics , , and , so the equation for impact subscore is defined as where and

From the frequency curves of , , and in Figure 2, the approximate convergence value 1/3 is chosen as the coefficient of . Let ; this definition makes use of an idea similar to the inclusion-exclusion principle, and the parameter can be determined by the following equation:

For database , the computing results for are listed in the fourth column of Table 6. As can be seen, , so . By (13), the impact subscores of SVRA can be calculated, and they are listed in the fifth column of Table 6. CVSS v2 has one impact vector with a score of 0, while SVRA has none. For SVRA, the mean for the theoretical score is 7.9 and the median is 8.1; the standard deviation is 1.58 and the skew is . This represents a significant change from v2, which has a mean of 7.0, a median of 7.8, a standard deviation of 2.53, and a skew of . The theoretical distributions of the impact subscores for both SVRA and CVSS v2 are shown in Figure 5. From a theoretical viewpoint, v2 subscores have much less diversity than SVRA subscores.

Figure 6 shows the curves of the two variables and from 2002 to 2012. When vulnerability data increases, the changes for and are minor. This indicates that (13) has better stability and can dynamically compute when changes over time. Figure 7 shows how impact subscores change over time for two impact types: and . For SVRA, the impact subscore can increase, decrease, or remain unchanged when changes over time.

2.4. Base Score

Given time and a vulnerability , the new equation of base score is as follows:

Figure 8 shows a comparison between SVRA and CVSS v2. Some vulnerability types have high SVRA scores but low v2 scores, and others have high v2 scores but low SVRA scores. We examined the theoretical distributions of SVRA and CVSS v2 scores; see Figure 9. For SVRA, the mean for the theoretical scores is 5.9, the median is 5.8, the standard deviation is 1.34, and the skew is 0.22. This represents a significant change from v2, which has a mean of 4.7, a median of 4.9, a standard deviation of 2.20, and a skew of . This illustrates that SVRA has superior numerical normality and stability. Figure 10 illustrates the changing trends of two vulnerability types (Type 1 and Type 2), as well as the mean of the SVRA base scores from 2002 to 2012. The CVSS v2 base scores of Type 1 and Type 2 are 6.6 and 5.6, respectively. However, the SVRA base score of Type 1 first decreases and then settles in the range . The of Type 2 increases by 1.1 from 4.9 to 6.0. At first, the rise of the mean is rapid, and then the increase slows down. So, when new vulnerabilities are added into the database , many of become more and more serious. In real life, two or more vulnerabilities may be combined to form a critical issue. The Google Chrome Pwnium full exploits are excellent examples, in which strings of vulnerabilities are combined into a full sandbox escape, resulting in arbitrary code execution. So, these curves reflect the fact that vulnerabilities interact with each other.

3. Experimental Analysis and Comparisons

This section describes our experimental analysis of the SVRA and CVSS v2 base scores for 54,432 vulnerabilities listed in CVE. Figure 11 shows a comparison between SVRA and CVSS v2. For the v2 experimental scores, the mean is 6.3, the median is 6.8, the standard deviation is 2.02, and the skew is . This represents an increase of 1.6 in the mean and 1.9 in the median from the theoretical data. Approximately 58.43% of the scores are above 5.0, 16.58% are at 5.0, and 24.98% are below 5.0. For the SVRA experimental scores, the mean is 8.1 and the median is 8.0. This represents an increase of 2.2 in both the mean and median from the theoretical data. The standard deviation is 1.26 and the skew is . Of the scores, approximately 99.09% are above 5.0, 0.21% are at 5.0, and 0.69% are below 5.0. This is consistent with the CVSS-SIG's goal to have the majority of scores above 5.0.
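The summary statistics reported above can be reproduced for any list of scores with a short routine such as the one below. The paper does not say whether population or sample estimators were used, so the population standard deviation and the population (moment-based) skewness are assumptions here; the sample scores are invented for illustration.

```python
from statistics import mean, median, pstdev


def summarize(scores, threshold=5.0):
    """Mean, median, population std, population skewness, and shares around a threshold."""
    n = len(scores)
    mu = mean(scores)
    sigma = pstdev(scores)
    # Moment-based skewness: third central moment divided by sigma cubed.
    skew = sum((s - mu) ** 3 for s in scores) / n / sigma ** 3 if sigma else 0.0
    return {
        "mean": mu,
        "median": median(scores),
        "std": sigma,
        "skew": skew,
        "pct_above": 100 * sum(s > threshold for s in scores) / n,
        "pct_at": 100 * sum(s == threshold for s in scores) / n,
        "pct_below": 100 * sum(s < threshold for s in scores) / n,
    }


if __name__ == "__main__":
    hypothetical_scores = [4.0, 5.0, 6.0, 7.0, 8.0]  # not data from the paper
    for name, value in summarize(hypothetical_scores).items():
        print(f"{name}: {value:.2f}")
```

A negative skew indicates that the bulk of the scores lies toward the high end of the range with a tail toward low scores.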

The national vulnerability database (NVD) [4] generates a base score for each vulnerability and then assigns a ranking based on the score. The rankings are Low (0.0 to 3.9), Medium (4.0 to 6.9), and High (7.0 to 10.0) [10]. The motivation for having these rankings is to help organizations prioritize their mitigations of new vulnerabilities. Table 7 lists the comparison results of several vulnerabilities among the four authoritative security organizations (Secunia in Denmark, FrSIRT in France, ISS X-Force in the USA, and CVSS). As can be seen, SVRA coincides with the majority of the organizations. For the vulnerabilities CVE-2007-1497 and CVE-2007-2242, SVRA adjusts the CVSS rankings from High to Medium.
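The NVD ranking bands quoted above map directly onto a small function. The function name is illustrative; the thresholds are the ones stated in the text for scores rounded to one decimal place.

```python
def nvd_ranking(score):
    """Map a base score in [0.0, 10.0] to the NVD ranking band."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("score must lie between 0.0 and 10.0")
    if score < 4.0:
        return "Low"      # 0.0 to 3.9
    if score < 7.0:
        return "Medium"   # 4.0 to 6.9
    return "High"         # 7.0 to 10.0


if __name__ == "__main__":
    for s in (3.9, 4.0, 6.9, 7.0):
        print(s, nvd_ranking(s))
```

A change of a few tenths near a band boundary is enough to move a vulnerability from one ranking to another, which is how a rescoring can turn a High ranking into a Medium one.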

We also performed rankings for the theoretical data, as shown in Table 8. The rankings differ dramatically between CVSS v2 and SVRA, with SVRA having more Medium and High vulnerabilities but fewer Low vulnerabilities.

There are times when a vulnerability is scored as 0.0 by CVSS v2 standards. These are often vulnerabilities that do pose some threat, albeit a limited one. Nevertheless, if the issue is considered a vulnerability by the industry, this should be reflected through the assignment of a real score. One such example is arbitrary site redirection. Per current CVSS v2 scoring rules, this flaw would be scored as 0.0, whereas it receives an SVRA score of 5.4.

4. Conclusions

The CVSS empirical values given by CVSS-SIG cannot distinguish software vulnerabilities that have identical scores but different severities. In this paper, a software vulnerability rating approach (SVRA) is proposed based on a vulnerability database. With the SVRA, the frequencies of CVSS metrics are analyzed at different times. The equations for both exploitability and impact subscores are given in terms of these frequencies. To create an SVRA score, SVRA performs a weighted average of these two subscores. As the frequency changes over time, each metric takes different values instead of the constant empirical value. The score of a vulnerability is dynamically computed at different time points using the vulnerability database. The theoretical and experimental results illustrate the efficiency of the SVRA.

Although the SVRA was developed for the base metric group, the approach can be extended to the temporal metric group and the environmental metric group. Further work will include predicting whether vulnerability severity changes so much over time that future modifications to the SVRA may be needed.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported by the Funds NSFC61171121 and the Science Foundation of Chinese Ministry of Education—China Mobile 2012.