Abstract

We discuss a parameter estimation problem for a Gaussian copula model under misspecification. Conventional estimators such as the maximum likelihood estimator (MLE) do not work well if the model is misspecified. We propose an estimator that minimizes the projective power entropy. We call it the $\gamma$-estimator, where $\gamma$ denotes the power index. A feasible form of the projective power entropy is given that suits the Gaussian copula model. It is shown that the $\gamma$-estimator is robust against outliers. In addition, the $\gamma$-estimator can appropriately detect a heterogeneous structure of the underlying distribution, even if the underlying distribution consists of several different copula components while a single Gaussian copula is used as the statistical model. We explore this ability of the $\gamma$-estimator to detect local structures in comparison with the MLE. We also propose a fixed point algorithm to obtain the $\gamma$-estimator. The usefulness of the proposed methodology is demonstrated in numerical experiments.

1. Introduction

Applications of copula models have been increasing in number in recent years. There are a variety of applications in finance, risk management [1], and multivariate time series analysis [2]. With copula models, the marginal distributions are specified separately from the dependence structure of the joint distribution. Hence, copulas give a convenient way to construct flexible and more general multivariate distributions. As far as we know, only a few works tackle the identification and statistical estimation of mixtures of copula models, and most of them rely on MCMC algorithms. In this paper we focus on a misspecified Gaussian copula model. In other words, a sample follows a distribution mixed from different sources, but the statistical model we fit is just a single Gaussian copula. It is very hard to construct multivariate copulas for three or more random variables [3], while the Gaussian copula is an exception. So we start with the Gaussian copula model, but later in Section 5 we will show that our method is closely related to the $t$-copula. As an example of misspecification, we consider that the underlying distribution is $(1-\varepsilon)\,c(u;\Sigma_1)+\varepsilon\,c(u;\Sigma_2)$ (1), where $\varepsilon$ is a mixing proportion and $c(\cdot\,;\Sigma)$ denotes the probability density function of the Gaussian copula with the correlation matrix parameter $\Sigma$. We see that the MLE for $\Sigma$ almost surely converges to $(1-\varepsilon)\Sigma_1+\varepsilon\Sigma_2$ under the assumption (1), which means that the MLE fails to detect the structure of the underlying distribution.

We make use of the $\gamma$-estimator [4, 5], which is obtained by minimizing the projective power entropy. Here $\gamma$ denotes the power index, and if $\gamma = 0$, the $\gamma$-estimator reduces to the MLE. So the $\gamma$-estimator can be regarded as an extension of the MLE. In [5], the robustness of the $\gamma$-estimator was investigated in a general setting of parametric models. In [6], the minimum density power divergence estimator, which also uses a power of the density, was proposed for the covariance matrix of multivariate time series, and its robustness was shown. Our research shows that even if a single Gaussian copula model is incorrectly fitted to data from the mixture distribution (1), the $\gamma$-estimator can detect both $\Sigma_1$ and $\Sigma_2$ separately if $\Sigma_1$ and $\Sigma_2$ are “distinct” enough and the mixing proportion $\varepsilon$ is close to 0.5.

The $\gamma$-estimation for the Gaussian copula model relies on the projective power cross entropy between the underlying distribution and the Gaussian copula model $c(\cdot\,;\Sigma)$. The projective power cross entropy, which is a function of $\Sigma$, has one or several local minima depending on the underlying distribution. We show that if $\Sigma_1$ and $\Sigma_2$ are “distinct” enough and $\varepsilon$ is near 0.5, then the projective power cross entropy between the underlying mixture distribution (1) and the Gaussian copula has two local minimizers near $\Sigma_1$ and $\Sigma_2$, respectively, so we propose to use these local minimizers to detect $\Sigma_1$ and $\Sigma_2$.

This paper is organized as follows. The $\gamma$-estimator and the MLE for the Gaussian copula model, together with a fixed point algorithm to obtain them, are given in Section 2. Section 3 states the relationship between the projective power entropy and the $\gamma$-estimator. We introduce an appropriate measure for the Gaussian copula model since the projective power entropy is defined with respect to some carrier measure. Section 4 reveals the property of the $\gamma$-estimator to detect heterogeneous structures. Section 5 elucidates the relationship between maximum entropy distributions and the $\gamma$-estimation. The robustness of the $\gamma$-estimator is discussed based on its influence function in Section 6. A simulation study is given in Section 7, and discussions are given in the last section. The proofs for all the theoretical results are provided in the appendix.

2. Estimation of the Gaussian Copula Model

In Section 2.1, the $\gamma$-estimator for the Gaussian copula model is discussed, and in the following subsection the MLE for the Gaussian copula model is given. The last subsection lays out a fixed point algorithm to obtain the $\gamma$-estimator and the MLE.

2.1. The $\gamma$-Estimator for the Gaussian Copula Model

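The density function of the Gaussian copula is given by $c(u;\Sigma)=|\Sigma|^{-1/2}\exp\{-\tfrac{1}{2}\,x^{\top}(\Sigma^{-1}-I_p)\,x\}$, where $x=(\Phi^{-1}(u_1),\dots,\Phi^{-1}(u_p))^{\top}$ for $u=(u_1,\dots,u_p)^{\top}\in(0,1)^p$, $\Phi$ denotes the cumulative distribution function of the standard normal distribution, $\Sigma$ is a correlation matrix, and $I_p$ is the identity matrix of size $p$. Let $\theta$ be the $p(p-1)/2$-dimensional vector which consists of the column-wise stacked lower diagonal elements of $\Sigma$. For example, $\theta=(\sigma_{21},\sigma_{31},\sigma_{32})^{\top}$ if $p=3$. The set $\Theta=\{\theta : \Sigma \text{ is positive definite}\}$ is the parameter space of the Gaussian copula model.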
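The density above can be evaluated directly from the normal scores. The following is a minimal sketch, assuming the standard form $|\Sigma|^{-1/2}\exp\{-\tfrac12 x^{\top}(\Sigma^{-1}-I)x\}$ of the Gaussian copula density; the function names `normal_scores` and `gaussian_copula_density` are hypothetical and not taken from the paper.

```python
import numpy as np
from statistics import NormalDist


def normal_scores(u):
    """Map a copula observation u in (0,1)^p to x = Phi^{-1}(u), elementwise."""
    inv_cdf = np.vectorize(NormalDist().inv_cdf)
    return inv_cdf(np.asarray(u, dtype=float))


def gaussian_copula_density(u, sigma):
    """Gaussian copula density at one point u for correlation matrix sigma.

    Assumed form: |Sigma|^{-1/2} * exp(-x'(Sigma^{-1} - I)x / 2).
    """
    x = normal_scores(u)
    sigma = np.asarray(sigma, dtype=float)
    p = sigma.shape[0]
    quad = x @ (np.linalg.inv(sigma) - np.eye(p)) @ x
    return np.linalg.det(sigma) ** -0.5 * np.exp(-0.5 * quad)
```

With the identity correlation matrix the density is identically 1 (the independence copula), which gives a quick sanity check.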

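Let $u_1,\dots,u_n$ be a random sample from a copula with the probability density function $g$ while $c(\cdot\,;\Sigma)$ is our statistical model. The loss function associated with the projective power entropy introduced in Section 3.1 is given by $L_\gamma(\Sigma)=-|\Sigma|^{-\gamma/\{2(1+\gamma)\}}\,\frac{1}{n}\sum_{i=1}^{n}\exp\{-\tfrac{\gamma}{2}\,x_i^{\top}\Sigma^{-1}x_i\}$ (4) up to constant, where $x_i=(\Phi^{-1}(u_{i1}),\dots,\Phi^{-1}(u_{ip}))^{\top}$ for $i=1,\dots,n$. The $\gamma$-estimator is proposed as the set of local minimizers of $L_\gamma$ and interpreted as follows. If $L_\gamma$ has a single local minimum, the underlying distribution is estimated by the Gaussian copula with the minimizer as its parameter. If $L_\gamma$ has several local minima, the underlying distribution is estimated by a mixture of Gaussian copulas. Each Gaussian copula’s parameter is estimated by the corresponding local minimizer.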
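As an illustration of the loss function, here is a minimal sketch that assumes the Gaussian-type form $-|\Sigma|^{-\gamma/\{2(1+\gamma)\}}\,n^{-1}\sum_i \exp(-\tfrac{\gamma}{2}x_i^{\top}\Sigma^{-1}x_i)$ on the normal scores $x_i$, defined only up to a positive constant. The function name `gamma_loss` is hypothetical, and the normalization is an assumption rather than the paper's exact expression.

```python
import numpy as np


def gamma_loss(x, sigma, gamma):
    """Empirical gamma-loss on normal scores x (n x p), up to a positive constant.

    Assumed form: -|Sigma|^{-gamma/(2(1+gamma))} * mean_i exp(-gamma/2 * x_i' Sigma^{-1} x_i).
    """
    x = np.asarray(x, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    # Quadratic forms x_i' Sigma^{-1} x_i for every observation.
    quad = np.einsum("ij,jk,ik->i", x, np.linalg.inv(sigma), x)
    scale = np.linalg.det(sigma) ** (-gamma / (2.0 * (1.0 + gamma)))
    return -scale * np.mean(np.exp(-0.5 * gamma * quad))
```

Observations with a large quadratic form receive exponentially small weight when `gamma > 0`, which is the mechanism behind both the robustness and the multimodality discussed later.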

2.2. The MLE for the Gaussian Copula Model

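We consider the MLE for the Gaussian copula model in the same setting as in Section 2.1. The log-likelihood multiplied by $-1/n$ is given by $L_0(\Sigma)=\tfrac{1}{2}\log|\Sigma|+\frac{1}{2n}\sum_{i=1}^{n}x_i^{\top}(\Sigma^{-1}-I_p)\,x_i$. It is easy to see that the loss function $L_\gamma$ of Section 2.1 and $L_0$ satisfy $\lim_{\gamma\to 0}\{L_\gamma(\Sigma)+1\}/\gamma=L_0(\Sigma)$ up to constant, so the MLE can be regarded as the 0-estimator in terms of the $\gamma$-estimator. Generally the $\gamma$-estimator can be regarded as an extension of the MLE. It is well known that the MLE does not work well under model misspecification. For example, in the case of (1) the MLE for the Gaussian copula model almost surely converges to $(1-\varepsilon)\Sigma_1+\varepsilon\Sigma_2$, so we can detect neither $\Sigma_1$ nor $\Sigma_2$. For instance, if $\varepsilon=0.5$ and $\Sigma_1+\Sigma_2=2I_p$, then the limit is equal to the identity matrix, which has no meaning in this situation. We cannot use the MLE in the case of misspecification.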
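The failure of the MLE under the mixture can be illustrated numerically. The sketch below is an assumption-laden illustration, not the paper's experiment: it samples normal scores from an equal-weight mixture of two bivariate Gaussian components with correlations $0.8$ and $-0.8$ (hypothetical values) and computes the second-moment matrix of the scores, whose population value is the mixture of the two correlation matrices.

```python
import numpy as np


def sample_mixture_scores(n, sigma1, sigma2, eps, rng):
    """Draw n normal-score vectors from (1 - eps) N(0, sigma1) + eps N(0, sigma2)."""
    from_second = rng.random(n) < eps
    x1 = rng.multivariate_normal(np.zeros(len(sigma1)), sigma1, size=n)
    x2 = rng.multivariate_normal(np.zeros(len(sigma2)), sigma2, size=n)
    return np.where(from_second[:, None], x2, x1)


rng = np.random.default_rng(0)
sigma1 = np.array([[1.0, 0.8], [0.8, 1.0]])    # hypothetical first component
sigma2 = np.array([[1.0, -0.8], [-0.8, 1.0]])  # hypothetical second component
x = sample_mixture_scores(200_000, sigma1, sigma2, 0.5, rng)

# Second-moment matrix of the scores: close to 0.5*sigma1 + 0.5*sigma2 = identity.
second_moment = x.T @ x / len(x)
```

Both components are strongly dependent, yet the pooled second-moment matrix is close to the identity, so a single fitted correlation matrix hides the structure entirely.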

2.3. A Fixed Point Algorithm to Obtain the $\gamma$-Estimator for the Gaussian Copula Model

We give a fixed point algorithm to obtain the $\gamma$-estimator for the Gaussian copula model using the Lagrange-multiplier method. The appendix provides the details of the derivation of the algorithm. The same algorithm yields the MLE just by setting $\gamma=0$.

Algorithm

(1) Set an appropriate initial correlation matrix $\Sigma_0$.

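(2) Given $\Sigma_t$, calculate $\Sigma_{t+1}$ by the following update formula: $\Sigma_{t+1}=S_t-\Sigma_t\,\mathrm{diag}\{(\Sigma_t\circ\Sigma_t)^{-1}(\mathrm{Diag}(S_t)-1_p)\}\,\Sigma_t$ (8), where $\circ$ denotes the Hadamard product and $1_p$ is the $p$-dimensional vector of ones. $S_t$ is defined by $S_t=(1+\gamma)\sum_{i=1}^{n}w_i\,x_i x_i^{\top}\big/\sum_{i=1}^{n}w_i$, where $w_i=\exp\{-\tfrac{\gamma}{2}\,x_i^{\top}\Sigma_t^{-1}x_i\}$. Here $\mathrm{Diag}(A)$ for a square matrix $A$ denotes the column vector which consists of the diagonal elements of $A$, and $\mathrm{diag}(v)$ for a vector $v$ denotes the diagonal matrix whose diagonal elements are the components of $v$.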

(3) For a sufficiently small given number $\delta>0$, repeat step 2 while $\|\Sigma_{t+1}-\Sigma_t\|>\delta$, where $\|A\|$ for a square matrix $A$ denotes the matrix norm defined by .

(4) To find all local minimizers, repeat steps 1–3 for different initial values $\Sigma_0$.
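The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the update is the one obtained by applying Lagrange multipliers for the unit-diagonal constraint to the Gaussian-type loss, the stopping rule uses the Frobenius norm (an assumption, since the norm is not specified here), and the function name `gamma_fixed_point` is hypothetical.

```python
import numpy as np


def gamma_fixed_point(x, gamma, sigma0, tol=1e-8, max_iter=500):
    """Fixed-point iteration for a correlation matrix on normal scores x (n x p).

    Assumed update: Sigma <- S - Sigma diag(lam) Sigma, where
    S = (1 + gamma) * weighted second moment and
    lam = (Sigma o Sigma)^{-1} (Diag(S) - 1) keeps the diagonal equal to 1.
    """
    x = np.asarray(x, dtype=float)
    sigma = np.asarray(sigma0, dtype=float).copy()
    p = sigma.shape[0]
    for _ in range(max_iter):
        quad = np.einsum("ij,jk,ik->i", x, np.linalg.inv(sigma), x)
        w = np.exp(-0.5 * gamma * quad)                  # observation weights
        s = (1.0 + gamma) * (x * w[:, None]).T @ x / w.sum()
        lam = np.linalg.solve(sigma * sigma, np.diag(s) - np.ones(p))
        new_sigma = s - sigma @ np.diag(lam) @ sigma     # unit diagonal by construction
        if np.linalg.norm(new_sigma - sigma) < tol:      # Frobenius norm (assumption)
            return new_sigma
        sigma = new_sigma
    return sigma
```

Step (4) corresponds to calling this function from several initial matrices `sigma0` and collecting the distinct limits. Convergence and positive definiteness of the iterates are not guaranteed by this sketch and should be checked in practice.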

If we consider the estimation problem for Gaussian distributions with mean 0, the update formula for an iteration algorithm to obtain the $\gamma$-estimator of the covariance matrix is given by $\Sigma_{t+1}=(1+\gamma)\sum_{i=1}^{n}w_i\,x_i x_i^{\top}\big/\sum_{i=1}^{n}w_i$ with $w_i=\exp\{-\tfrac{\gamma}{2}\,x_i^{\top}\Sigma_t^{-1}x_i\}$ (12). See [5] for details. If we consider the optimization problem with the same loss function as the objective but without the constraint that the diagonal elements of $\Sigma$ are 1, the same iteration algorithm (12) can be deduced. So the second term on the right hand side of (8) appears because of the constraint.

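We make a remark on the algorithm to obtain the MLE, that is, the $\gamma$-estimator with $\gamma=0$. The MLE has a rather complicated solution even in the simple case $p=2$. In [1], an approximate MLE for the Gaussian copula model is shown because it takes quite a while to solve the constrained optimization problem in order to obtain the MLE in high dimensions. The approximate MLE is given by $D^{-1/2}SD^{-1/2}$ (13), where $S=\frac{1}{n}\sum_{i=1}^{n}x_i x_i^{\top}$ and $D$ is the diagonal matrix whose diagonal elements are equal to those of $S$. We can easily construct an iteration algorithm to obtain an approximate $\gamma$-estimator by combining (12) and (13). The update formula of the algorithm is given by $\tilde{\Sigma}_{t+1}=D_t^{-1/2}S_tD_t^{-1/2}$, where $S_t$ is the right hand side of (12) evaluated at $\tilde{\Sigma}_t$ and $D_t$ is the diagonal matrix whose diagonal elements are equal to those of $S_t$. If the sample size is infinite, the exact and the approximate iterations converge to the same correlation matrix as the number of iterations tends to infinity. However, the two estimators are different in general, and the exact $\gamma$-estimator is preferred to the approximate one in terms of accuracy.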
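One step of the approximate iteration can be sketched as below. The sketch assumes that combining the two formulas means computing the weighted second-moment matrix of the unconstrained update and then rescaling it to unit diagonal; the function name `approx_gamma_update` is hypothetical.

```python
import numpy as np


def approx_gamma_update(x, gamma, sigma):
    """One approximate update: unconstrained gamma-step, then rescale to a correlation matrix."""
    x = np.asarray(x, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    quad = np.einsum("ij,jk,ik->i", x, np.linalg.inv(sigma), x)
    w = np.exp(-0.5 * gamma * quad)
    s = (1.0 + gamma) * (x * w[:, None]).T @ x / w.sum()  # unconstrained step
    d = 1.0 / np.sqrt(np.diag(s))
    return s * np.outer(d, d)                             # D^{-1/2} S D^{-1/2}
```

The rescaling avoids the linear solve needed for the Lagrange multipliers, at the cost of no longer solving the constrained estimating equation exactly.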

3. Projective Power Entropy and $\gamma$-Estimator

In Section 3.1 the projective power entropy and the $\gamma$-estimator are given. In the next subsection we discuss an appropriate measure for the Gaussian copula model.

3.1. Projective Power Entropy and the $\gamma$-Estimator

The projective power entropy of with the index and the measure is defined as If is the Lebesgue measure denoted by and is a probability density function, then we have where is defined by , which is equivalent to the Boltzmann-Shannon entropy. The projective power cross entropy between and with the index and the measure is defined as The projective power divergence is given by satisfies , and if and only if , so can be seen as a kind of distance between and .

Let , be a random sample from a probability density function and a statistical model. Since we want to find the closest distribution to in the model , we want to find the minimizer of , which is equal to the minimizer of . If has the Radon-Nikodym derivative , then is equal to where Empirically the projective power cross entropy can be estimated by which is called the loss function. Note that The original $\gamma$-estimator is defined by the minimizer of the loss function : Note that we do not necessarily seek the global minimizer. Rather, we allow the loss function to be multimodal, so we use the term $\gamma$-estimator for the set of its local minimizers. See [4, 5] for details of the $\gamma$-estimator.

3.2. Choice of the Carrier Measure

In calculating the $\gamma$-estimator, the carrier measure can be chosen by the user. Here we propose, for Gaussian copula models, the use of a measure, denoted by , whose Radon-Nikodym derivative is given by , where is the Jacobian of the transformation . From now on we refer to this choice as and explain its rationale by virtue of invariance.

We assume that , where denotes the probability density function of the $p$-dimensional Gaussian distribution with mean and correlation matrix . Let ; then ~ . If the underlying distribution of is , then ~ , where is given by . It is noteworthy that the projective power cross entropy between and based on is not always equal to the projective power cross entropy between and based on . So the $\gamma$-estimator based on does not coincide with the $\gamma$-estimator based on .

It is natural to require that the two $\gamma$-estimators be equivalent, and therefore we employ the measure . It is striking that the projective power cross entropy between and calculated under the measure is equal to the projective power cross entropy between and calculated under the Lebesgue measure , which is proportional to Consequently the two $\gamma$-estimators coincide. Note that the loss function associated with cross entropy (26) becomes (4).

The argument above extends to a general statement. For given one to one transformation , denotes the inverse function of and denotes the Jacobian of the transformation . Any nonnegative functions , satisfy if and only if the Radon-Nikodym derivative of is equal to . When and are the probability density functions, to consider the projective power cross entropy on under the Lebesgue measure is equal to consider the projective power cross entropy on under the measure having as its Radon-Nikodym derivative.

4. Property of the $\gamma$-Estimator

The $\gamma$-estimator for the Gaussian copula model under infinite sample size is equal to the set of the local minimizers of the projective power cross entropy between the underlying distribution and the model. In this section we leave aside the empirical loss function for the moment and investigate the property of the $\gamma$-estimator (in the infinite-sample limit) through this population quantity. First we consider the case where there is no misspecification.

Theorem 1. If , then has the local minimum at .

In this case we note that the $\gamma$-estimator is equal to the true parameter, which implies Fisher consistency. Asymptotically, the $\gamma$-estimator is consistent and asymptotically normal.

Next we consider the misspecification case where the true data generating process is given by (1). We see that the population cross entropy, up to a proportionality constant, is a weighted mean of the two projective power cross entropies associated with the two components. Each component is a unimodal function, bounded above by 0, and has one local minimum at $\Sigma_1$ and $\Sigma_2$, respectively. So we expect that the population cross entropy has two local minima and that these local minimizers are near $\Sigma_1$ and $\Sigma_2$, respectively, if $\Sigma_1$ and $\Sigma_2$ are sufficiently “distinct.” However, it is hard to formulate such a phenomenon mathematically, so we show through easy examples and a graph that it occurs. To obtain numerical solutions, we use the expected (or population) version of the algorithm in Section 2.3.

Example 2. In the case with dimension 2, is a univariate function of , which is the nondiagonal element of . Let and be , and . If     , then has two local minima in the interval and , respectively.

Example 3. Suppose the true correlation matrices and are given as follows, and stands for the parameterization of the statistical model we fit: We also set and . Note that is a function of and . Figure 1 shows the graph of . We can see there exist two local maxima at and .

Example 4. Suppose and . When , and are given by Then has two local minima at
As these examples show, the population cross entropy has several local minima depending on the underlying distribution. Owing to this property we can detect the heterogeneous structures of the underlying distribution under misspecification.
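The appearance of two local minima can be checked numerically in the bivariate case. The sketch below is an illustration under assumptions, not a reproduction of the examples above: it uses the Gaussian-type population loss $-|\Sigma|^{-\gamma/\{2(1+\gamma)\}}\sum_k \pi_k\,|I+\gamma\Sigma_k\Sigma^{-1}|^{-1/2}$ (which follows from integrating the weight against each Gaussian component), with hypothetical component correlations $\pm 0.9$, equal weights, and $\gamma=1$.

```python
import numpy as np


def population_loss(sigma, components, weights, gamma):
    """Population gamma-loss of a Gaussian model against a Gaussian mixture (up to a constant)."""
    sigma = np.asarray(sigma, dtype=float)
    p = sigma.shape[0]
    inv = np.linalg.inv(sigma)
    a = gamma / (2.0 * (1.0 + gamma))
    # E exp(-gamma/2 x' Sigma^{-1} x) for x ~ N(0, Sigma_k) equals |I + gamma Sigma_k Sigma^{-1}|^{-1/2}.
    mix = sum(
        w * np.linalg.det(np.eye(p) + gamma * np.asarray(s_k) @ inv) ** -0.5
        for w, s_k in zip(weights, components)
    )
    return -(np.linalg.det(sigma) ** -a) * mix


def corr2(r):
    """2 x 2 correlation matrix with off-diagonal element r."""
    return np.array([[1.0, r], [r, 1.0]])


grid = np.linspace(-0.99, 0.99, 199)
loss = np.array(
    [population_loss(corr2(r), [corr2(0.9), corr2(-0.9)], [0.5, 0.5], 1.0) for r in grid]
)
# Interior grid points that are strictly below both neighbours.
is_min = (loss[1:-1] < loss[:-2]) & (loss[1:-1] < loss[2:])
local_minimizers = grid[1:-1][is_min]
```

In this configuration the loss has one local minimizer on each side of zero, pulled somewhat toward zero relative to $\pm 0.9$, whereas the $\gamma=0$ (MLE) population solution is the single point $r=0$.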

5. Maximum Entropy Distribution

So far we have considered the $\gamma$-estimation of the Gaussian copula model. In this section we show that the choice of copula model can be characterized in terms of the maximum entropy distribution. In this regard, the most closely related work is [2], in which the MLE on the meta-$t$-distribution is addressed. A $t$-copula is deduced from a multivariate $t$-distribution, while the meta-$t$-distribution is constructed by linking a $t$-copula to univariate $t$-distributions as its marginal distributions. In our framework, the work in [2] can be interpreted as the maximum likelihood estimation of $t$-copulas with the marginals estimated simultaneously. Actually the $\gamma$-estimation of Gaussian copulas and the maximum likelihood estimation of $t$-copulas look very similar and share a common idea.

In [4], it is analyzed what the maximum projective power entropy distributions would be under the given (population) mean vector and covariance matrix. The answer depends on the power index $\gamma$. When $\gamma=0$, the Gaussian distribution emerges as the maximum projective power entropy distribution. If , the $t$-distribution comes up. We show that a similar result holds for copulas. Let be the cumulative distribution function of the $t$-distribution with degrees of freedom , and ,  . We suppose that . Let be the set of probability density functions on which satisfy the following equation: Let denote the probability density function of the $t$-distribution with degrees of freedom and correlation matrix , and let denote the probability density function of its copula ($t$-copula): Then we see If , then and . So we see that That is, the $t$-copula can be characterized as the maximum projective power entropy distribution on . Moreover it has limiting equivalence (by letting ) with the Gaussian copula, which is the maximum Boltzmann-Shannon entropy distribution. We call these maximum projective power entropy copulas the $\gamma$-copulas. Let us consider the relationship between the $\gamma$-copula and the $\gamma$-estimation. Our method concerns the pair of the Gaussian copula (0-copula) and the $\gamma$-estimator. On the other hand, [2] discussed the pair of the $\gamma$-copula model () and the MLE (0-estimator). We see a sort of duality between the two choices of pair.

6. Robustness

In this section we examine the robustness of the $\gamma$-estimator for the Gaussian copula model through its influence function. The influence function measures the asymptotic bias caused by contamination at a given point. Boundedness of the influence function means that the influence of an outlier is bounded, hence that the estimator is robust. The influence function of the $\gamma$-estimator is given in Section 6.1. We show that it is bounded when $\gamma>0$. In the next subsection, a brief simulation is performed.

6.1. Influence Function

The $\gamma$-estimator for the Gaussian copula model can be regarded as a functional of a distribution defined by Let be Then the influence function of the $\gamma$-estimator is given by where . See [7] for details. The boundedness of the influence function is equivalent to the boundedness of . The following theorem gives a bound of .

Theorem 5. When $\gamma=0$, that is, for the MLE, the influence function is not bounded. When , the influence function is not bounded. When $\gamma>0$, the influence function is bounded and a bound is given by where $\otimes$ denotes the Kronecker product and for an -dimensional vector denotes the Euclidean norm defined by .

For example, if is equal to , then .
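The contrast between $\gamma=0$ and $\gamma>0$ can be seen in a one-line calculation. The sketch below does not compute the bound of Theorem 5; it only illustrates, under the assumption that the estimating function contains the weighted term $\exp(-\tfrac{\gamma}{2}x^{\top}\Sigma^{-1}x)\,xx^{\top}$, how the weight tames an outlier moving away along a unit direction when $\Sigma=I$. The function name `weighted_outer_norm` is hypothetical.

```python
import numpy as np


def weighted_outer_norm(t, gamma):
    """Frobenius norm of exp(-gamma/2 |x|^2) x x' for x = t * e with |e| = 1 and Sigma = I."""
    t = np.asarray(t, dtype=float)
    return t ** 2 * np.exp(-0.5 * gamma * t ** 2)


t = np.linspace(0.0, 50.0, 50001)
sup_robust = weighted_outer_norm(t, 0.5).max()  # bounded: analytic maximum is 2 / (gamma * e)
sup_mle = weighted_outer_norm(t, 0.0).max()     # grows like t^2, limited only by the grid
```

For $\gamma>0$ the term attains its maximum $2/(\gamma e)$ at $t=\sqrt{2/\gamma}$ and then decays, while for $\gamma=0$ it grows without bound as the outlier moves away.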

6.2. Simulation

This subsection describes the results of Monte Carlo simulations carried out in order to examine the robustness of the $\gamma$-estimator for the Gaussian copula model. We generate 500 pseudorandom samples of size 500 from distribution where is equal to the independence copula and is given by For each sample, we calculate the $\gamma$-estimator for the Gaussian copula model with and the MLE for the Gaussian copula model. We use the norm as the accuracy measure. Table 1 shows the root mean squared error (RMSE) of the norm for the $\gamma$-estimator and the MLE. We can see that the RMSE for the $\gamma$-estimator is less than that for the MLE, so the $\gamma$-estimator is more robust than the MLE.

7. Simulation Study

The property of the $\gamma$-estimator to detect heterogeneous structures is investigated by a series of simulations. A comparison of the $\gamma$-estimator with the MLE for a mixture Gaussian copula (1) is also discussed.

7.1. Simulation Setup

We conducted two kinds of simulation.

Simulation 1. The underlying distribution was constructed based on the one factor Gaussian copula model [8]. Suppose where have independently the standard normal distribution. Then we see , , where satisfies Let the underlying distribution be (1), where and are made from the one factor Gaussian copula model. This model means that the dependence structure is expressed by the mixture of Gaussian copulas. Set . is made with and with Then we have
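The one factor construction can be sketched as follows, assuming the usual form $x_i=a_i z+\sqrt{1-a_i^2}\,e_i$ with independent standard normal $z$ and $e_i$, so that the correlation between $x_i$ and $x_j$ is $a_i a_j$. The loading vector `a` and the function names are hypothetical and the numerical loadings are not those of the simulation above.

```python
import numpy as np


def one_factor_corr(a):
    """Correlation matrix implied by loadings a: entry (i, j) is a_i * a_j, diagonal 1."""
    a = np.asarray(a, dtype=float)
    sigma = np.outer(a, a)
    np.fill_diagonal(sigma, 1.0)
    return sigma


def sample_one_factor(n, a, rng):
    """Draw n vectors x_i = a_i * z + sqrt(1 - a_i^2) * e_i with a common factor z."""
    a = np.asarray(a, dtype=float)
    z = rng.standard_normal(n)
    e = rng.standard_normal((n, len(a)))
    return z[:, None] * a + e * np.sqrt(1.0 - a ** 2)
```

Applying the standard normal distribution function to each coordinate of a draw gives an observation from the corresponding Gaussian copula; a mixture is obtained by choosing between two loading vectors at random for each observation.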

The $\gamma$-estimator for the Gaussian copula model with is investigated. Initial values of which is used in calculating the $\gamma$-estimator are , , where is the correlation matrix whose component () is equal to . If the $\gamma$-estimator has two components and such that then is thought of as an estimator of and denoted by . Similarly for and denoted by .

We adopt the MLE for a mixture Gaussian copula model (1). Although and are correlation matrices, we tentatively treat them as covariance matrices and use the EM algorithm to obtain an approximate MLE. The obtained estimators and are not necessarily correlation matrices, so they are transformed into correlation matrices by which is denoted by for . The initial value of which is used in calculating the MLE is set to .

A set of data of size () was generated from (1), and the norm of , , , and were calculated. 500 simulations were carried out, and then, we calculated the RMSE of the norm based on 500 norm values obtained by simulation. The results are shown in Table 3.

Simulation 2. Suppose that the underlying distribution is where and are the same in Simulation 1. The other settings are the same as in Simulation 1. The results are shown in Table 5.

7.2. Result

Result of Simulation 1. Table 2 shows the ratio for the $\gamma$-estimator to detect two correlation matrices. For nearly 80 percent was successful, and for it worked out almost perfectly. From Table 3, the MLE had better performance than the $\gamma$-estimator. However, this is natural because the MLE is used under no misspecification.

Result of Simulation 2. Table 4 shows the ratio for the $\gamma$-estimator to detect two correlation matrices. Compared to the result of Simulation 1 the detection rate at gets worse, while at the result is almost the same as in Table 2. From Table 5, we find that the MLE considerably underperforms and the $\gamma$-estimator is much better.

8. Discussion

We have considered an estimation problem for a misspecified Gaussian copula model. The simulation study shows that our methodology works well under misspecification. Though we did not consider how to determine the value of $\gamma$, this problem was considered in [9] for independent component analysis. It could be possible to follow their method in our problem, but this is left for future work.

We choose the measure in terms of invariance. However, the $\gamma$-estimator obtained is equal to the estimator with the normal distribution as a statistical model, so it seems natural. If we use the Lebesgue measure in calculating the $\gamma$-estimator for the Gaussian copula model, we cannot calculate the projective power entropy for all values of and .

Another issue is to what extent the methodology here works for time series data. Because the basic premise of this paper is that we have data as quantiles, our method would fit, for example, the modeling of unconditional loss distribution [1, page 28]. Such a case is of particular interest when the time horizon over which we measure our losses is relatively large. When we are working on the conditional modeling, our method should be regarded as a tool for the post analysis. As a typical case, we may want to apply our mixture copula approach to multivariate log-return series which are appropriately standardized and declustered by the multivariate GARCH model fitted to them. See [2] for more details.

Appendices

A. Derivation for the Algorithm

We derive the estimation equation for . Since is symmetric and positive definite, there exists a matrix of size which satisfies . The th diagonal element of is expressed by , where is the -dimensional column vector whose th element is 1 and the other elements are 0. Since the diagonal elements of are equal to 1, Lagrange function becomes where is Lagrange multiplier. We differentiate (A.1) with respect to with the technique in [10]. The differential of , which is defined in [10, Sections and ], is where diag is the diagonal matrix whose diagonal elements are . From Table in [10, Chapter 9] we have Set the derivative of (A.1) to ; then we have Multiply from the left side of (A.4); then (A.4) becomes where From the constraint about the diagonal elements of , we have In general, for any square matrices and of size and -dimensional column vector , we have So (A.7) becomes Then we have and use this estimation equation as an update formula.

B. Proof of Theorem 1

We see that Consider a monotone transformation of the right hand side of (B.1) to obtain For any , let and define by We see Let be the Kullback-Leibler divergence between and . It is well known that and equal to 0 if and only if . So for , we have If we read as and , then we see that . The proof is complete.

C. Proof of Theorem 5

If we see It is obvious that is not bounded with respect to . Next if , then let , where is a symmetric matrix. Set ; then Express in polar coordinate; then where , , . Hence If and , then we see is not bounded. Next if , we see Since , , where vec denotes the vec operator. In addition we observe Since we see

D. Proof of the Statement $t$-Copula Is the Maximum Projective Power Entropy Distribution with

We see The numerator of (D.1) becomes for . Hence