#### Abstract

In the loss distribution approach (LDA), the most popular approach to operational risk modeling, banks should consider two kinds of dependence across business lines: frequency dependence and loss distribution dependence. In practice, mainly for simplicity, many banks model only frequency dependence, even though they regard its impact as insignificant. This study introduces two approaches, one modeling frequency dependence and the other modeling loss distribution dependence. Both use copula functions, which are capable of capturing nonlinear dependence. Using what is, as far as we know, the most comprehensive operational risk dataset of Chinese banking, we calculate the operational risk capital charge of the overall Chinese banking sector with both approaches. The results show a marked difference between the capital calculated by modeling frequency dependence and the capital calculated by modeling loss dependence: the approach that has received very limited attention yields a much larger capital figure. We therefore advise that banks should not rely solely on frequency dependence modeling merely because it is natural and easy to deal with. A safer and more effective way for banks is to take the results of both approaches into consideration.

#### 1. Introduction

The sudden collapse of Barings Bank alerted the banking industry to operational risk, which is why, following credit risk and market risk, operational risk became the third risk covered by Basel Accord II. Operational risk is defined as the risk of loss resulting from inadequate or failed internal processes, people, and systems or from external events [1, 2]. Basel Accord II proposes the basic indicator approach (BIA), the standardized approach (SA), and the advanced measurement approach (AMA) to calculate operational risk capital. Among the three, AMA is the most sophisticated and much more risk sensitive than the other two, so the Basel Committee on Banking Supervision (hereafter BCBS) encourages banks to develop and use the AMA in operational risk management [3]. Many approaches, such as the loss distribution approach (LDA) and extreme value theory (EVT), have been developed for operational risk modeling so far [4, 5].

Among all kinds of AMA, LDA, first developed by the actuarial industry and then introduced to operational risk modeling by Frachot et al. [6], is widely accepted as the most popular approach for calculating the capital charge for operational risk [4, 7–9]. LDA is a parametric technique that consists of separately estimating a frequency distribution for the occurrence of operational losses and a severity distribution for the economic impact of individual losses. With these two distributions, the bank then computes the probability distribution of the aggregated operational loss. The Basel II Accord categorizes operational risk events into eight business lines, that is, corporate finance, trading and sales, retail banking, commercial banking, payment and settlements, agency services, asset management, and retail brokerage [3]; however, the traditional LDA does not account for the dependence across business lines.

A more reasonable way to calculate capital charge for operational risk is to incorporate the dependence between different business lines into LDA. In this kind of LDA, separate distributions for loss frequency and severity are modeled at business line level and then combined [10]. In this basic model, three kinds of dependences might occur and so should be taken seriously, that is, frequency dependence, severity dependence, and loss distribution dependence [11, 12]. Frequency dependence is the dependence between the occurrences of loss events. In practice it means that we observe that, historically, the number of (say) commercial banking events is high (resp., low) when the number of (say) corporate finance events is also high (resp., low). Severity dependence is the dependence between the sizes of individual loss events. Loss distribution dependence is the dependence between the yearly losses of business lines [11].

Many approaches to modeling dependence in LDA have emerged so far. As to frequency dependence, there is a contradiction. On one hand, the approaches modeling frequency dependence are probably the most popular in the banking industry. On the other hand, many studies hold that frequency correlation has only a very restricted impact on economic capital for operational risk [11, 13–15]. For example, Deutsche Bank, a world leader in risk management, specifies only the dependence of frequency distributions in its LDA model, although it regards the impact of frequency dependence as insignificant [11]. The main reasons for this contradiction are that the correlation between the yearly numbers of loss events across business lines can be calculated quite easily from empirical loss data, and that adding correlation between event frequencies is an easy task that does not destroy the very nature of the LDA model. Therefore this type of correlation can be taken into account at minimum cost [12]. Compared with frequency dependence, loss distribution dependence has received very limited attention. Chapelle et al. [9] and Giacometti et al. [16] directly model the dependence of losses through the use of copulas in order to combine the marginal distributions of different business lines into a single joint loss distribution. Severity dependence necessarily alters, to a substantial extent, the basic foundations of the standard LDA model and requires building an entirely new family of models [12]. It is therefore scarcely studied in the literature and is not discussed in this study either.

The objective of this study is to discuss whether the popular frequency dependence modeling is sufficient for banks when they allocate operational risk capital. Firstly, we introduce two approaches in the framework of LDA, one modeling frequency dependence and the other modeling loss dependence. When it comes to modeling dependence, the copula function has attracted broad attention because, in contrast to linear correlation, it is capable of capturing the complete dependence structure inherent in a random vector [17–19]. According to the survey of BCBS [15], dependence is introduced into the modeling process mainly by use of copulas. So, in this study, copulas are employed to model both frequency dependence and loss dependence.

Then both approaches are employed to calculate the operational risk capital charge for Chinese banking. Our laboratory has been collecting operational risk data of Chinese banking for about 10 years. Up to now, a total of 2132 records, ranging from 1994 to 2012, have been collected in our dataset. To the best of our knowledge, this operational risk dataset is the most comprehensive one in China. In the experiment, both approaches are employed to calculate the operational risk capital of Chinese banking based on this dataset.

The rest of this paper is organized as follows. Section 2 introduces the two approaches, which model frequency dependence and loss dependence, respectively. Section 3 employs the two approaches to calculate the operational risk capital charge of the overall Chinese banking sector. Section 4 summarizes the conclusions.

#### 2. Approaches

In this section, we introduce two approaches that model frequency dependence and loss dependence, respectively, in the framework of LDA. In both approaches the dependence is modeled by copula functions, and the aggregate operational loss distribution is obtained by Monte Carlo simulation. VaR is then used to measure the operational risk capital charge. Pioneered by J.P. Morgan, VaR has become a standard measure of financial risk [20]. VaR at a specific confidence level $\alpha$ is defined as the smallest number $l$ such that the probability of the loss $L$ exceeding $l$ is not larger than $1 - \alpha$:

$$\mathrm{VaR}_{\alpha} = \inf\{\, l \in \mathbb{R} : P(L > l) \le 1 - \alpha \,\}.$$

Note that frequency refers to the number of loss events occurring in a year, severity refers to the size of an individual loss event, and loss refers to the total size of all loss events in a year.
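As a minimal sketch of how the VaR definition can be applied to simulated losses, the snippet below takes the empirical quantile of a sample. The function name `empirical_var` and the small numerical tolerance are illustrative assumptions, not part of the original study.

```python
import math


def empirical_var(losses, alpha):
    """Smallest sample value l such that the share of losses above l is <= 1 - alpha."""
    if not losses or not 0.0 < alpha < 1.0:
        raise ValueError("need a non-empty sample and 0 < alpha < 1")
    ordered = sorted(losses)
    n = len(ordered)
    # At most n - k sample points exceed ordered[k - 1], so we need
    # (n - k) / n <= 1 - alpha, i.e. k >= alpha * n.
    # The tiny tolerance guards against floating-point noise in alpha * n.
    k = max(1, math.ceil(alpha * n - 1e-9))
    return ordered[k - 1]


sample = [5, 1, 4, 2, 3, 10, 9, 8, 7, 6]
print(empirical_var(sample, 0.9))
```

With a Monte Carlo sample of aggregate losses, the same function would be called with the confidence level of interest, for example 0.999.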

##### 2.1. LDA with Frequency Dependence (LDA-FD)

It is observed that, historically, the number of (say) commercial banking events is high (resp., low) when the number of (say) corporate finance events is also high (resp., low). This phenomenon implies that there is underlying correlation or dependence between the frequencies of business lines.

In this approach, we model the dependence structure of the frequencies in the framework of LDA, and we name the approach LDA-FD. The whole procedure consists of 3 stages and 9 steps. In Stage 1, the marginal frequency and severity distributions are fitted, and the copula used to model the dependence structure is specified. Usually, a goodness-of-fit test is used to determine the best fitting distributions and copula. In Stage 2, Monte Carlo simulation is used to derive the aggregate loss distribution, and in Stage 3, VaR is calculated from the aggregate loss distribution. Let *N* denote the number of Monte Carlo simulations. The specific steps of the LDA-FD approach are as follows.

###### 2.1.1. Stage 1: Preparation

*Step 1*. Choose the best fitting distribution for the frequency of eight business lines.

*Step 2*. Choose the best fitting distribution for the severity of eight business lines.

*Step 3.* Choose the proper copula to describe the dependence structure of frequencies.

###### 2.1.2. Stage 2: Simulation

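*Step 4.* Draw a joint sample of uniform random variables $(u_1, \ldots, u_8)$ from the specified copula $C$.

*Step 5.* Translate the sample from the copula into a sample of frequencies $(n_1, \ldots, n_8)$ by applying the inverse of each marginal frequency distribution, $n_i = F_i^{-1}(u_i)$.

*Step 6.* Simulate $n_1$ losses randomly from the severity distribution $G_1$ of the first business line, …, and simulate $n_8$ losses from the severity distribution $G_8$ of the eighth business line.

*Step 7.* Sum all the simulated losses from Step 6 to derive the aggregate loss $L$.

*Step 8.* Repeat Step 4 to Step 7 $N$ times to derive $N$ simulated aggregate losses $L_1, \ldots, L_N$.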

###### 2.1.3. Stage 3: Calculation

*Step 9.* Calculate VaR from the simulated aggregate losses.
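The simulation and calculation stages of LDA-FD can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it uses three business lines, a *t* copula with 1 degree of freedom (chosen here because its marginal CDF has a closed form), negative binomial frequencies, and lognormal severities as a simple stand-in for the fitted severity family. Every parameter value and function name is hypothetical.

```python
import math
import random

# Hypothetical placeholders for three business lines (not the paper's estimates).
CORR = [[1.0, 0.5, 0.3],
        [0.5, 1.0, 0.4],
        [0.3, 0.4, 1.0]]                      # t copula correlation matrix
FREQ = [(5.0, 0.6), (8.0, 0.5), (3.0, 0.7)]   # negative binomial (r, p)
SEV = [(1.0, 1.2), (1.5, 1.0), (0.5, 1.5)]    # lognormal (mu, sigma), a stand-in


def cholesky(a):
    """Lower-triangular factor of a positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def t1_copula_sample(low, rng):
    """One draw from a t copula with 1 degree of freedom."""
    z = [rng.gauss(0.0, 1.0) for _ in low]
    x = [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(len(low))]
    w = abs(rng.gauss(0.0, 1.0)) or 1e-300    # square root of a chi-square(1) draw
    # The t(1) (Cauchy) CDF has the closed form 0.5 + atan(x) / pi.
    return [0.5 + math.atan(xi / w) / math.pi for xi in x]


def nbinom_quantile(u, r, p, cap=10_000):
    """Smallest k with CDF(k) >= u, for P(X=k) = C(k+r-1, k) p^k (1-p)^r."""
    pmf = (1.0 - p) ** r
    cdf, k = pmf, 0
    while cdf < u and k < cap:
        pmf *= (k + r) / (k + 1) * p
        k += 1
        cdf += pmf
    return k


def simulate_lda_fd(n_sims, seed=0):
    rng = random.Random(seed)
    low = cholesky(CORR)
    totals = []
    for _ in range(n_sims):                                   # Step 8
        u = t1_copula_sample(low, rng)                        # Step 4
        counts = [nbinom_quantile(ui, r, p)                   # Step 5
                  for ui, (r, p) in zip(u, FREQ)]
        total = 0.0
        for n_i, (mu, sigma) in zip(counts, SEV):             # Steps 6 and 7
            total += sum(rng.lognormvariate(mu, sigma) for _ in range(n_i))
        totals.append(total)
    return totals


def value_at_risk(losses, alpha):                             # Step 9
    ordered = sorted(losses)
    k = max(1, math.ceil(alpha * len(ordered) - 1e-9))
    return ordered[k - 1]


losses = simulate_lda_fd(2000)
print(value_at_risk(losses, 0.999))
```

The dependence enters only through the joint uniform draw in Step 4; given the sampled frequencies, the individual severities are drawn independently, which is why this form of dependence leaves the core of the LDA model unchanged.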

##### 2.2. LDA with Loss Dependence (LDA-LD)

In this approach, a higher level of dependence is taken into account: we directly model the dependence between the losses of different business lines. This kind of aggregation resembles the “top-down” aggregation of different risk types [21, 22]. We model the dependence structure of losses in the framework of LDA and name the approach LDA-LD. The number of Monte Carlo simulations is again set to *N*. The whole procedure of the LDA-LD approach, consisting of 3 stages and 7 steps, is as follows.

###### 2.2.1. Stage 1: Preparation

*Step 1*. Choose the best fitting distribution for losses of eight business lines.

*Step 2*. Choose the proper copula to describe dependence structure of losses.

###### 2.2.2. Stage 2: Simulation

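*Step 3.* Draw a joint sample of uniform random variables $(u_1, \ldots, u_8)$ from the specified copula $C$.

*Step 4.* Translate the sample from the copula into a sample of losses $(l_1, \ldots, l_8)$ by applying the inverse of each marginal loss distribution, $l_i = H_i^{-1}(u_i)$.

*Step 5.* Sum all the losses from Step 4 to derive the simulated aggregate loss $L = l_1 + \cdots + l_8$.

*Step 6.* Repeat Step 3 to Step 5 $N$ times to derive $N$ simulated aggregate losses $L_1, \ldots, L_N$.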

###### 2.2.3. Stage 3: Calculation

*Step 7.* Calculate VaR from the simulated aggregate losses.
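A comparable sketch for LDA-LD is given below. It assumes three business lines, lognormal marginal distributions for the yearly losses, and a *t* copula with 1 degree of freedom (an illustrative choice made because its marginal CDF has a closed form). All parameter values and function names are hypothetical and are not taken from the paper's tables.

```python
import math
import random
from statistics import NormalDist

# Hypothetical placeholders for three business lines (not the paper's estimates).
CORR = [[1.0, 0.6, 0.4],
        [0.6, 1.0, 0.5],
        [0.4, 0.5, 1.0]]                            # t copula correlation matrix
MARGINALS = [(2.0, 1.0), (2.5, 1.2), (1.5, 0.8)]    # lognormal (mu, sigma) of yearly loss
STD_NORMAL = NormalDist()


def cholesky(a):
    """Lower-triangular factor of a positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def t1_copula_sample(low, rng):
    """One draw from a t copula with 1 degree of freedom, kept strictly inside (0, 1)."""
    z = [rng.gauss(0.0, 1.0) for _ in low]
    x = [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(len(low))]
    w = abs(rng.gauss(0.0, 1.0)) or 1e-300          # square root of a chi-square(1) draw
    u = [0.5 + math.atan(xi / w) / math.pi for xi in x]
    return [min(max(ui, 1e-12), 1.0 - 1e-12) for ui in u]


def lognormal_quantile(u, mu, sigma):
    """Inverse CDF of the lognormal distribution."""
    return math.exp(mu + sigma * STD_NORMAL.inv_cdf(u))


def simulate_lda_ld(n_sims, seed=0):
    rng = random.Random(seed)
    low = cholesky(CORR)
    totals = []
    for _ in range(n_sims):                                    # Step 6
        u = t1_copula_sample(low, rng)                         # Step 3
        line_losses = [lognormal_quantile(ui, mu, sigma)       # Step 4
                       for ui, (mu, sigma) in zip(u, MARGINALS)]
        totals.append(sum(line_losses))                        # Step 5
    return totals


def value_at_risk(losses, alpha):                              # Step 7
    ordered = sorted(losses)
    k = max(1, math.ceil(alpha * len(ordered) - 1e-9))
    return ordered[k - 1]


losses = simulate_lda_ld(5000)
print(value_at_risk(losses, 0.999))
```

Compared with the frequency-dependence procedure, no severities are simulated here: each business line contributes a single yearly loss drawn from its marginal distribution, so the copula acts directly on the quantities that are summed.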

#### 3. Application to Chinese Banking

In this section, the LDA-FD and LDA-LD approaches introduced above are used to calculate the operational risk capital charge of the overall Chinese banking sector. Firstly, the dataset is introduced. Then some details of the experiment are explained. Finally, the results are presented and analyzed. All computations in this experiment are performed in R 3.0.1.

##### 3.1. Data Description

The collection of operational risk data is still at a preliminary stage worldwide, and the situation is even worse in Chinese banking. Our laboratory has been collecting operational risk data from publicly available information sources, such as newspapers, the Internet, and court filings, for almost 10 years. Up to now, a total of 2132 records, ranging from 1994 to 2012, have been collected in our dataset. For each record, the loss event description, start time, end time, exposed time, business line type, loss event type, loss amount in CNY, banks involved, location, key person involved, and so on are recorded in detail. To the best of our knowledge, this operational risk dataset of Chinese banking is the most comprehensive one in China. In the experiment, the end time, loss amount in CNY, and business line type are used.

In line with the Basel II Accord, operational risk events can be categorized into 8 business lines, that is, corporate finance, trading and sales, retail banking, commercial banking, payment and settlements, agency services, asset management, and retail brokerage, which are called BL1 to BL8 for short in the following text. Figure 1 presents the frequency of these business lines in the experiment dataset. It can be seen that operational risk events occur mostly in BL3, BL4, and BL5. So, in the experiment, we only consider the aggregation of BL3, BL4, and BL5.

##### 3.2. Experiment Design

Operational risk is characterized by “leptokurtosis and fat tail” [15], so the lognormal distribution, exponential distribution, Pareto distribution, gamma distribution, Weibull distribution, generalized hyperbolic distribution (GHD), generalized error distribution (GED), skewed generalized error distribution (SGED), and generalized Pareto distribution (GPD) are usually employed to fit the loss or severity distributions [4, 6, 8, 9]. GHD, GED, and SGED are fitted to the natural logarithm of the data, and the rest are fitted to the data directly. As for the frequency distribution, Poisson, negative binomial, and geometric distributions are frequently used [1, 9, 15]. In some studies, the Poisson distribution is employed to model frequency without any test, because it is easy to use and frequency is thought to have limited impact on the capital calculation [6, 11]. In this study, we use the Kolmogorov-Smirnov goodness-of-fit test (KS test for short) to find out which distribution fits the frequency, severity, and yearly loss best. All the parameters of the distributions in this experiment are estimated by the maximum likelihood method.
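The fit-then-test workflow described above can be illustrated for a single candidate distribution. The sketch below fits a lognormal distribution by maximum likelihood and computes the KS statistic with an asymptotic *P* value. It is a simplified stand-in for what a statistics package would do: the function names and the synthetic data are assumptions, and the asymptotic *P* value ignores the fact that the parameters were estimated from the same data.

```python
import math
from statistics import NormalDist


def fit_lognormal_mle(data):
    """Maximum likelihood estimates (mu, sigma) of a lognormal distribution."""
    logs = [math.log(x) for x in data]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / len(logs))
    return mu, sigma


def ks_statistic(data, cdf):
    """Largest gap between the empirical CDF and the candidate CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def ks_pvalue_asymptotic(d, n):
    """Asymptotic Kolmogorov P value with a small-sample correction factor."""
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.2:
        return 1.0  # the survival function is numerically 1 in this range
    total = 0.0
    for j in range(1, 101):
        term = 2.0 * (-1.0) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
        total += term
        if abs(term) < 1e-12:
            break
    return min(max(total, 0.0), 1.0)


# Synthetic sample placed at evenly spaced quantiles of a lognormal(1, 0.5).
base = NormalDist(1.0, 0.5)
data = [math.exp(base.inv_cdf((i + 0.5) / 50)) for i in range(50)]

mu, sigma = fit_lognormal_mle(data)
fitted = NormalDist(mu, sigma)
d = ks_statistic(data, lambda x: fitted.cdf(math.log(x)))
p_value = ks_pvalue_asymptotic(d, len(data))
print(mu, sigma, d, p_value)
```

Repeating the same two steps for each candidate family and keeping the one with the largest *P* value mirrors the selection rule used in the experiment.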

Among all types of copulas, the most frequently used are the Gaussian copula and *t* copula from the elliptical copula family and the Gumbel copula, Clayton copula, and Frank copula from the Archimedean copula family. For a detailed introduction to these copulas, please see [23, 24]. Compared with elliptical copulas, some Archimedean copulas are very sensitive to asymmetric tail dependence. However, their highly symmetric structures prevent their use in applications with more than two dimensions [21]. Therefore, in this experiment we choose the Gaussian copula and *t* copula to describe the dependence structure of yearly frequencies and yearly losses. The formulations of the two copulas are presented in the appendix. We use the goodness-of-fit test proposed by Kojadinovic and Yan [25] to determine which copula describes the dependence of the empirical data best.

As for the Monte Carlo simulation, the larger the number of simulations is, the more accurate the aggregate loss distribution is and the longer the simulation takes. In order to balance accuracy and time, the number of Monte Carlo simulations is set to 100 thousand, in line with most studies.

##### 3.3. Experiment Results

###### 3.3.1. Results from LDA-FD

In this section, LDA-FD, the approach modeling frequency dependence, is used to calculate the operational risk capital charge for the overall Chinese banking sector. Firstly, we find the best fitting distributions for frequency and severity by the KS test. Poisson, negative binomial, and geometric distributions are used to fit the yearly frequency. The KS test results reveal that the negative binomial distribution outperforms the other two distributions on all three business lines. For space considerations, only the results of the negative binomial distribution are presented in Table 1. For the KS test, the larger the *P* value is, the better the distribution fits the data, and the significance level is generally set at 5%. Table 1 shows that the *P* value is 0.59 for BL3, 0.87 for BL4, and 0.93 for BL5, all much larger than 5%. The parameters estimated by the maximum likelihood method for each marginal frequency distribution are also shown in Table 1.

As described above, the lognormal distribution, exponential distribution, Pareto distribution, gamma distribution, Weibull distribution, GHD, GED, SGED, and GPD are used to fit the severity data. The KS results show that SGED is the best fitting one for each business line. The results of SGED are shown in Table 2. The *P* values of the KS test for the three business lines are 0.62, 0.90, and 0.82, respectively, which are also much larger than 5%. The estimated parameters for each marginal severity distribution are also shown in Table 2.

Then, we choose a copula that can accurately capture the dependence structure of the yearly frequencies of different business lines. The goodness-of-fit test results are shown in Table 3. In this test, too, the larger the *P* value is, the better the copula fits the empirical data, and the significance level is generally set at 5%. Table 3 shows that the *P* value of the Gaussian copula is smaller than 5%. As for the *t* copula, the *t* copula with 1 degree of freedom outperforms *t* copulas with other degrees of freedom, and its *P* value is larger than 5%. So the *t* copula is chosen to describe the dependence of yearly frequencies. The parameters of the specified *t* copula, estimated by the maximum likelihood method, are also shown in Table 3.

Once the marginal frequency distributions, marginal severity distributions, and copula are specified, the capital charge for operational risk of the overall Chinese banking sector can be calculated by the LDA-FD approach presented in Section 2.1. For risk management purposes, the confidence level is generally set above 90%, so the VaRs at confidence levels 90%, 95%, 99%, 99.9%, and 99.97% are presented in Table 4. Table 4 shows that, as the confidence level increases, the VaR and economic capital become larger. The expected loss here is 14 billion CNY. Economic capital, defined as the difference between VaR and expected loss and so referred to as the “unexpected loss,” is also presented. Therefore, by considering frequency dependence, the VaR of the overall operational risk of Chinese banking is 231 billion CNY at 99.9%. The corresponding operational risk regulatory (economic) capital is 217 billion CNY.

###### 3.3.2. Results from LDA-LD

In this section, we use LDA-LD, the approach modeling yearly loss dependence, to calculate the operational risk capital charge for Chinese banking. Firstly, we find the best fitting distribution for the marginal yearly loss of each business line. Again, the lognormal distribution, exponential distribution, Pareto distribution, gamma distribution, Weibull distribution, GHD, GED, SGED, and GPD are employed to fit the empirical yearly loss data of each business line. The KS test is used to select the distribution that fits the data best. It turns out that the lognormal distribution is the best fitting distribution for all the marginal loss data. For space considerations, Table 5 only presents the results of the lognormal distribution. The *P* values of the KS test for the three business lines are 0.32, 0.46, and 0.59, respectively, all much larger than 5%. This means that the lognormal distribution fits each set of marginal loss data well. The parameters estimated by the maximum likelihood method are also presented in Table 5.

Secondly, we find a proper copula to describe the dependence structure of the yearly losses of these business lines. The goodness-of-fit test results, presented in Table 6, show that both copulas pass the goodness-of-fit test. The *P* value is 0.31 for the Gaussian copula and 0.93 for the *t* copula, both larger than 5%. Nevertheless, because a larger *P* value indicates a better fit, we use the *t* copula to describe the dependence structure of the yearly loss data in the Monte Carlo simulation. The *P* value for the *t* copula is as high as 0.93, which means that the *t* copula fits the yearly loss dependence very well. The parameters of the specified *t* copula, estimated by the maximum likelihood method, are also shown in Table 6.

Once the marginal loss distributions and copula are specified, the aggregate loss can be simulated by the LDA-LD approach presented in Section 2.2. The results at different confidence levels are presented in Table 7. The expected loss here is 20 billion CNY. Table 7 shows that, as the confidence level increases, the VaR and economic capital become larger. Therefore, by considering loss dependence, the VaR of the overall operational risk of Chinese banking is 675 billion CNY at 99.9%, and the corresponding regulatory (economic) capital is 655 billion CNY.
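As a quick check of the reported figures, the economic capital at the 99.9% level can be recomputed from the VaR and expected loss stated in the text for each approach. The symbols $\mathrm{EC}$ and $\mathrm{EL}$ are shorthand introduced here for economic capital and expected loss; amounts are in billion CNY.

```latex
\begin{align*}
\mathrm{EC}_{99.9\%} &= \mathrm{VaR}_{99.9\%} - \mathrm{EL} \\
\text{LDA-FD:}\quad \mathrm{EC}_{99.9\%} &= 231 - 14 = 217 \\
\text{LDA-LD:}\quad \mathrm{EC}_{99.9\%} &= 675 - 20 = 655
\end{align*}
```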

###### 3.3.3. Results Comparison and Analysis

In order to further analyze the results of the two approaches, we compare them with the results of BIA. Banks using the BIA must hold capital for operational risk equal to the average over the previous three years of a fixed percentage of positive annual gross income. The fixed percentage is set to 15% by the BCBS. For a more detailed formulation, please see BCBS [1] and BCBS [3]. As shown in Table 8, the gross income of listed banks is 1756 billion CNY in 2010, 2234 billion CNY in 2011, and 2596 billion CNY in 2012. So, according to BIA, the operational risk capital charge of listed commercial banks is $15\% \times (1756 + 2234 + 2596)/3 = 329.3$ billion CNY. Only the gross income of listed commercial banks is available, so we roughly estimate the capital of the overall Chinese banking sector as follows. The total assets of listed banks account for 67% of those of the overall Chinese banking sector in 2010, 66% in 2011, and 65% in 2012, so the capital charge of the overall Chinese banking sector is calculated as 329.3/66%, that is, 498.9 billion CNY.
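The BIA arithmetic above can be reproduced in a few lines. This is a sketch in which the function name `bia_capital` is hypothetical, and the 66% divisor is read as the average of the three yearly asset shares, which is an interpretation of the text rather than something it states explicitly.

```python
def bia_capital(gross_incomes, alpha=0.15):
    """BIA charge: alpha times the average of the positive annual gross incomes."""
    positive = [g for g in gross_incomes if g > 0]
    if not positive:
        return 0.0
    return alpha * sum(positive) / len(positive)


# Gross income of listed banks, billion CNY (2010, 2011, 2012), as given in the text.
listed_capital = bia_capital([1756, 2234, 2596])

# Share of listed banks in total banking assets for the same years.
asset_shares = [0.67, 0.66, 0.65]
average_share = sum(asset_shares) / len(asset_shares)

# Scale the listed-bank charge up to the whole banking sector.
overall_capital = listed_capital / average_share

print(round(listed_capital, 1), round(overall_capital, 1))
```

Years with zero or negative gross income are excluded from both the sum and the count, following the BIA rule that only positive annual gross income enters the average.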

Conventionally, the capital requirement is set to protect against losses over one year at the 99.9% level because this is roughly equivalent to the default risk of an A-rated corporate bond [6]. BCBS also recommends 99.9% as a proper confidence level. Therefore, we compare the VaRs from the two approaches at 99.9% with the result of BIA. All these results are presented in Table 9. Table 9 shows that the operational risk capital of Chinese banking calculated by BIA is 499 billion CNY. The result of the LDA-FD approach is 231 billion CNY, which is smaller than the result of BIA. By contrast, the result of the LDA-LD approach is 675 billion CNY, which is larger than the result of BIA. As mentioned in Section 1, in practice most banks consider only frequency dependence in their AMA. In the case of Chinese banking, this kind of approach, LDA-FD here, indeed gives a much smaller capital than BIA, whereas LDA-LD, which models yearly loss dependence and receives less attention, gives a larger capital than BIA.

There is a marked difference between the results of modeling frequency dependence and the results of modeling loss dependence: the approach that receives less attention, modeling loss dependence, yields a much larger result. Because of the lack of data, although many studies focus on operational risk modeling, very few give a specific operational risk capital charge. As far as we know, Feng et al. [8] are the only researchers who give a specific operational risk capital charge for the overall Chinese banking sector. Their results show that the operational risk capital of overall Chinese banking is about 234 billion CNY in 2008. Our results show that LDA-FD gives a capital charge of 231 billion CNY and LDA-LD gives a capital charge of 675 billion CNY in 2013. Chinese banks have developed very fast in recent years, so operational risk in 2013 ought generally to be larger than in 2008. However, the LDA-FD result of 231 billion CNY in 2013 is even smaller than the 234 billion CNY obtained by Feng et al. [8] for 2008. This comparison also supports our conclusion that the results of LDA-FD might be too small.

Therefore, we advise that banks should not rely solely on the result of modeling frequency dependence merely because it is natural and easy to deal with. A safer way to allocate capital is to take both results into consideration. Combination estimation, a widely used method, has been successfully introduced to the field of operational risk modeling by Feng et al. [8]. Here, a simple linear combination of the results of the LDA-FD and LDA-LD approaches is their mean, which is 453 billion CNY. This result is considered safer and more effective for banks.
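Written out, the equal-weight combination of the two 99.9% results is as follows. The weight $w$ is notation introduced here for illustration, with $w = 1/2$ corresponding to the simple mean used in the text; amounts are in billion CNY.

```latex
\begin{align*}
\mathrm{Capital}_{\text{comb}} &= w \cdot \mathrm{Capital}_{\text{LDA-FD}} + (1 - w) \cdot \mathrm{Capital}_{\text{LDA-LD}} \\
&= \tfrac{1}{2} \times 231 + \tfrac{1}{2} \times 675 = 453
\end{align*}
```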

#### 4. Conclusion

In this paper, we present two ways of modeling dependence in the framework of LDA. The approach modeling frequency dependence based on a copula is named the LDA-FD approach, and the approach modeling yearly loss dependence is named the LDA-LD approach. We then apply the two approaches to calculate the operational risk capital of the overall Chinese banking sector on what is, as far as we know, the most comprehensive operational risk dataset in China.

The operational risk capital of the overall Chinese banking sector calculated by the LDA-FD approach is 231 billion CNY, which is about half of the BIA result of 499 billion CNY. By contrast, the operational risk capital calculated by the LDA-LD approach is 675 billion CNY, which is larger than the result of BIA. There is a marked difference between the results of LDA-FD and LDA-LD. The approach that receives less attention, that is, the approach modeling loss dependence, yields a much larger result, so we advise that banks should not rely solely on frequency dependence modeling merely because it is natural and easy to deal with. A safer and more effective way for banks is to take the results of both approaches into consideration.

#### Appendix

#### A. Some Distributions and Copulas in This Study

##### A.1. Lognormal Distribution

The lognormal distribution is one of the standard distributions used in insurance and operational risk to model large claims and operational risk losses. The probability density function of the lognormal distribution is

$$f(x) = \frac{1}{x\sigma\sqrt{2\pi}} \exp\left(-\frac{(\ln x - \mu)^2}{2\sigma^2}\right), \quad x > 0,$$

where $\mu$ and $\sigma$ are the two parameters of the lognormal distribution, denoting the mean and standard deviation of $\ln x$.

##### A.2. Skewed Generalized Error Distribution (SGED)

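The probability density function of SGED is

$$f(y) = \frac{C}{\sigma} \exp\left(-\frac{|y - \mu + \delta\sigma|^{k}}{[1 + \operatorname{sgn}(y - \mu + \delta\sigma)\lambda]^{k}\,\theta^{k}\sigma^{k}}\right),$$

with

$$C = \frac{k}{2\theta}\,\Gamma\!\left(\tfrac{1}{k}\right)^{-1}, \quad \theta = \Gamma\!\left(\tfrac{1}{k}\right)^{1/2}\Gamma\!\left(\tfrac{3}{k}\right)^{-1/2} S(\lambda)^{-1}, \quad \delta = 2\lambda A\, S(\lambda)^{-1},$$

$$S(\lambda) = \sqrt{1 + 3\lambda^{2} - 4A^{2}\lambda^{2}}, \quad A = \Gamma\!\left(\tfrac{2}{k}\right)\Gamma\!\left(\tfrac{1}{k}\right)^{-1/2}\Gamma\!\left(\tfrac{3}{k}\right)^{-1/2},$$

where $\mu$ and $\sigma$ are the mean and standard deviation of $y$, $\lambda$ is a skewness parameter, “sgn” is the sign function, and $\Gamma(\cdot)$ is the gamma function. The scaling parameters $k$ and $\lambda$ obey the constraints $k > 0$ and $-1 < \lambda < 1$. For a more detailed introduction to SGED, please see [8].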

##### A.3. Negative Binomial Distribution

The negative binomial distribution is a discrete probability distribution of the number of successes in a sequence of Bernoulli trials before a specified (nonrandom) number of failures occurs. The probability function of the negative binomial distribution is

$$P(X = k) = \binom{k + r - 1}{k} p^{k} (1 - p)^{r}, \quad k = 0, 1, 2, \ldots,$$

where $r$ is the specified number of failures and $p$ is the probability of success in each trial.

##### A.4. Gaussian Copula and *t* Copula

Assume $u_1, \ldots, u_n$ follow the univariate standard uniform distribution. The Gaussian copula is

$$C^{Ga}_{\rho}(u_1, \ldots, u_n) = \Phi_{\rho}\left(\Phi^{-1}(u_1), \ldots, \Phi^{-1}(u_n)\right),$$

where $\Phi_{\rho}$ denotes the standard multivariate Gaussian distribution function with correlation coefficient matrix $\rho$ and $\Phi^{-1}$ denotes the inverse function of the standard univariate Gaussian distribution. The Gaussian copula was very popular in practice at first, owing to the common assumption of normality in many financial modeling applications and its ease of use and understanding. However, it does not allow for tail dependence, which is a major drawback and makes it a less favorable candidate for capital applications [26].

The *t* copula is

$$C^{t}_{\nu,\rho}(u_1, \ldots, u_n) = t_{\nu,\rho}\left(t_{\nu}^{-1}(u_1), \ldots, t_{\nu}^{-1}(u_n)\right),$$

where $t_{\nu,\rho}$ denotes the multivariate $t$ distribution function with $\nu$ degrees of freedom and correlation coefficient matrix $\rho$ and $t_{\nu}^{-1}$ denotes the inverse function of the univariate $t$ distribution function with $\nu$ degrees of freedom. There is a growing literature on the usefulness of the *t* copula as an alternative to the Gaussian copula for modeling financial risks. The main impetus for the rise of the *t* copula is its ability to incorporate tail dependence [27]. Theoretically, the lower the degree of freedom, the heavier the tail dependence of a *t* copula. The Gaussian copula is in fact a limiting case of the *t* copula as the degree of freedom approaches $\infty$. By contrast, the *t* copula exhibits the heaviest tail dependence as the degree of freedom approaches 1.
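The relation between the degree of freedom and tail dependence can be made concrete with the standard closed-form coefficient of tail dependence for a bivariate *t* copula. The notation is introduced here for illustration: $\lambda_{\text{tail}}$ is the coefficient, $\nu$ the degree of freedom, $\rho_{12}$ the pairwise correlation, and $t_{\nu+1}$ the univariate $t$ distribution function with $\nu + 1$ degrees of freedom.

```latex
\begin{align*}
\lambda_{\text{tail}} &= 2\, t_{\nu+1}\!\left(-\sqrt{\frac{(\nu + 1)(1 - \rho_{12})}{1 + \rho_{12}}}\right)
  && \text{($t$ copula)} \\
\lambda_{\text{tail}} &= 0 \quad \text{for } \rho_{12} < 1
  && \text{(Gaussian copula)}
\end{align*}
```

For a fixed correlation, the argument of $t_{\nu+1}$ moves further into the left tail as $\nu$ grows, so the coefficient decreases toward the Gaussian value of zero, which is consistent with the statement that lower degrees of freedom give heavier tail dependence.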

#### Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

#### Acknowledgments

This research has been supported by Grants from the National Natural Science Foundation of China (71071148 and 71301087), Key Research Program of Institute of Policy and Management, Chinese Academy of Sciences, and Youth Innovation Promotion Association of the Chinese Academy of Sciences.