#### Abstract

A Gram-Charlier distribution has a density that is a polynomial times a normal density. For option pricing this retains the tractability of the normal distribution while allowing nonzero skewness and excess kurtosis. Properties of the Gram-Charlier distributions are derived, leading to the definition of a process with independent Gram-Charlier increments, as well as formulas for option prices and their sensitivities. A procedure for simulating Gram-Charlier distributions and processes is given. Numerical illustrations show the effect of skewness and kurtosis on option prices.

#### 1. Introduction

Gram-Charlier series are expansions of the form
$$f(x) = \phi(x)\left[1 + \sum_{k=1}^{\infty} c_k He_k(x)\right], \tag{1}$$
where $\phi$ is the usual normal density and $He_k$ is the Hermite polynomial of order $k$. The expression within square brackets in (1) is an orthogonal polynomial expansion for the ratio $f(x)/\phi(x)$; given an arbitrary function $f$, the expansion may or may not converge to the true value of $f(x)/\phi(x)$. In this paper we focus on the properties of the Gram-Charlier distributions, obtained by truncating the series after a finite number of terms. The result is a family of distributions parametrized by $(a, b, c_1, \dots, c_N)$, as is explained in detail below.

This paper has three main goals: (1) define and study the properties of the family of Gram-Charlier distributions; (2) define a Gram-Charlier process and derive its basic properties; (3) apply those to European options. The formulas we give for European option prices and Greeks apply to Gram-Charlier distributions of any order, and we use four- and six-parameter Gram-Charlier distributions in our examples. Two numerical illustrations show how option prices are affected by the skewness and kurtosis of returns. This paper can serve as a reference for those using Gram-Charlier distributions in option pricing, but also in statistics.

Most previous applications to option pricing have assumed that $c_1 = c_2 = 0$. We believe this restriction is not necessary in a general theory of Gram-Charlier distributions, as there may well be situations where the extra degrees of freedom given by $c_1$ and $c_2$ will be useful. We discuss this restriction in detail at the end of Section 2.

It has been observed that option prices have nonconstant implied volatilities, meaning that log returns do not have a normal distribution under the risk-neutral measure. There is a wide literature on modelling log returns to fit observed option prices, the main alternatives to Brownian motion being stochastic volatility models (where the parameter $\sigma$ in Black-Scholes is replaced with a continuous-time stochastic process), GARCH time series, and Lévy processes. Gram-Charlier distributions are mathematically simpler than the models just mentioned, while allowing a better fit to data than the normal distribution. Several authors, among others [1–10], have used Gram-Charlier distributions in option pricing as a model for log returns. The majority of previous authors assumed a density of the form
$$\phi(x)\left[1 + \frac{s}{3!}\,He_3(x) + \frac{k}{4!}\,He_4(x)\right] \tag{3}$$
for the normalized log return. (In our notation this is a $GC(0, 1, 0, 0, s/3!, k/4!)$ distribution; see Section 2.) The notation emphasizes that in this case the coefficient of $He_3$ turns out to be Pearson’s skewness coefficient divided by 6, and the coefficient of $He_4$ the excess kurtosis coefficient divided by 24. The distribution of log returns is then a four-parameter Gram-Charlier distribution (since there are two other parameters, the mean and variance, besides $s$ and $k$). This distribution allows nonzero skewness and excess kurtosis, unlike the normal distribution found in Black-Scholes. In (3) the parameters $(s, k)$ are restricted to a specific region (see Figure 3), because outside that region the function in (3) becomes negative for some values of $x$ (see Section 2.1).

Jurczenko et al. [8] specify the martingale restriction that the four-parameter Gram-Charlier density must satisfy in pricing options (previous authors had not taken it into account). The martingale condition for general Gram-Charlier distributions is described in Section 5. Our Gram-Charlier distribution with parameters $a, b, c_1, \dots, c_N$, denoted by $GC(a, b, c_1, \dots, c_N)$, has density
$$\frac{1}{b}\,\phi\!\left(\frac{x-a}{b}\right) p\!\left(\frac{x-a}{b}\right), \qquad p(y) = 1 + \sum_{k=1}^{N} c_k He_k(y). \tag{5}$$
As stated above we do not set $c_1 = c_2 = 0$, and $N$ can be any even positive integer. The question whether (5) is nonnegative for all $x$ is an important one. Several authors have disregarded this issue and, in fact, some have come up with parameters that do not yield a true probability density function. One might argue that this is the price to pay for truncating an infinite Gram-Charlier series. However, if the same log return distribution is used to price many options, then a true probability density function is the only safe choice, because otherwise there might be inconsistencies among option prices. For instance, if the function used as density is negative over the interval $(x_1, x_2)$, then a digital option that pays off only when the log return is in that interval will have a negative price. In this paper we consider Gram-Charlier *distributions*, not *expansions*, and insist that the densities integrate to one and be nonnegative. Our goal is to define a family of proper probability distributions; nevertheless, our formulas do apply to truncated Gram-Charlier expansions as well.

The paper by León et al. [11] presents an alternative to the general Gram-Charlier distributions we study in this paper. Those authors consider the subclass of Gram-Charlier distributions consisting of densities $\phi(x)p(x)$ where the polynomial $p$ is the square of another polynomial, $p = q^2$. This has the obvious advantage that the nonnegativity restriction on $p$ is automatically satisfied. We discuss that subclass of “squared” Gram-Charlier distributions in the Conclusion.

Almost all previous authors have used Gram-Charlier distributed log returns over a single time period. This has an obvious downside, in that it becomes tricky, if not impossible, to preserve consistency between the prices of options with different maturities. Section 3 shows that a Lévy process with Gram-Charlier increments does not exist; however, it also shows that the sum of independent Gram-Charlier distributed variables also has a Gram-Charlier distribution. This opens the way for multiperiod Gram-Charlier option pricing, using a discrete-time random walk model for which the log return over any period has a Gram-Charlier distribution. The Gram-Charlier distribution of the multiperiod return has a larger number of parameters, though the model is still simpler than almost any (if not all) alternative stochastic volatility models. There is no problem computationally, since we give explicit formulas for options under Gram-Charlier distributions with an arbitrary number of parameters.

The layout of the paper is as follows. In Section 2, we extend the study of Gram-Charlier distributions to all possible polynomials and derive their properties (moments, cumulants, moment determinacy, properties of the set of valid parameters, tail, and so on). Some of the formulas and properties in Theorem 2 appear to be new. In Section 3 we show that there is no Lévy process with Gram-Charlier distributed increments, apart from Brownian motion, and we define a discrete-time process with independent Gram-Charlier increments that is suitable for option pricing. In Section 4 we show that the log Gram-Charlier distribution is not determined by its moments, just like the lognormal. Next, Section 5 gives formulas for European call and put prices when the log-price returns of the underlying have a general Gram-Charlier distribution; the previous literature mostly considered the family $GC(a, b, 0, 0, c_3, c_4)$ and the squared Gram-Charlier distributions (an exception is [10], where it is assumed that $c_1 = c_2 = 0$). In particular, we derive a change of measure formula that extends the Cameron-Martin formula for the normal distribution; the latter is used in pricing European options in the Black-Scholes model. We also derive formulas for the sensitivities (Greeks) of those option prices with respect to all parameters. A technique for simulating Gram-Charlier distributions is described. Parts (c), (d), (i), (j), (k), and (l) of Theorem 2, part (b) of Theorem 3, and Theorems 6, 7, 9, and 11 appear to be new. Theorem 3(a) is formulated here for the first time for general Gram-Charlier distributions. Theorems 3(a), 4, and 8 are given, more or less explicitly, for the subclass of squared Gram-Charlier distributions in [11], and some of the Greeks in Theorem 9 had been calculated for the $GC(a, b, 0, 0, c_3, c_4)$ distribution by previous authors.

In Section 6 we give two applications that show how option prices depend on skewness and kurtosis of the log returns (this of course cannot be done in the classical Black-Scholes setting, while it appears quite complicated to do so in stochastic volatility or Lévy driven models). The first example is equity indexed annuities (EIAs in the sequel) premium options. The pricing of EIAs has been studied by many authors, including Hardy [12], Gaillardetz and Lin [13], and Boyle and Tian [14]. The second example is lookback options, which illustrates the use of the simulation method in Section 2.5. Lookback options have been studied by numerous authors, but there is no closed form formula for the price of discretely monitored lookback options within the Black-Scholes model. We refer the reader to Kou [15]. In those examples all parameters are estimated by maximum likelihood, with the range of the parameters restricted to where they correspond to a true probability distribution. Moment estimation is of course possible when $c_1 = c_2 = 0$, but it is not trustworthy, because of the range restriction on the parameters $(c_3, c_4)$; for instance, the empirical skewness and excess kurtosis have a positive probability of falling outside the feasible region (see Section 6). Theorem 2(f) shows that the first $n$ moments of the Gram-Charlier distribution depend on more than $n$ parameters, which probably rules out moment estimation, unless one a priori fixes two of the parameters (which again is not trustworthy). We apply maximum likelihood to the six-parameter $GC(a, b, c_1, c_2, c_3, c_4)$ in Section 6, but we do realize that maximum likelihood estimation for higher order Gram-Charlier distributions poses computational problems, which are an interesting avenue for further research. (A referee pointed out that fixing $c_1$ and $c_2$ would make estimation easier; our guess is that if one wishes to simplify maximum likelihood estimation then fixing $a$ and $b$ might be a better idea than fixing $c_1$ and $c_2$, though we have not looked at this in any depth. A two-step process might be imagined, whereby the data first give information about $a$ and $b$, and then maximum likelihood is applied using that preliminary information. This would help by constraining the optimization to a smaller region, while agreeing with the intuitive idea that $a$ and $b$ are location and scale parameters, respectively.)

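*Notation 1.* The normal density function is denoted as
$$\phi(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}$$
and its distribution function is
$$\Phi(x) = \int_{-\infty}^{x} \phi(y)\,dy.$$
Two equivalent versions of the Hermite polynomials may be found in the literature: for $n = 0, 1, 2, \dots$,
$$H_n(x) = (-1)^n e^{x^2}\frac{d^n}{dx^n}e^{-x^2}, \qquad He_n(x) = (-1)^n e^{x^2/2}\frac{d^n}{dx^n}e^{-x^2/2}.$$
The first one is common in mathematics and physics, but in probability and statistics there is an obvious advantage in using the second one. (The conversion formula is $He_n(x) = 2^{-n/2}H_n(x/\sqrt{2})$.) The first few Hermite polynomials are
$$He_0(x) = 1, \quad He_1(x) = x, \quad He_2(x) = x^2 - 1, \quad He_3(x) = x^3 - 3x,$$
$$He_4(x) = x^4 - 6x^2 + 3, \quad He_5(x) = x^5 - 10x^3 + 15x, \quad He_6(x) = x^6 - 15x^4 + 45x^2 - 15.$$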
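As a quick numerical check of the two Hermite conventions, the following sketch compares them on a small grid. It assumes NumPy's `numpy.polynomial.hermite` module for the physicists' $H_n$ and `numpy.polynomial.hermite_e` for the probabilists' $He_n$; the helper `basis` is a hypothetical name introduced only for this illustration.

```python
import numpy as np
from numpy.polynomial import hermite as H      # physicists' H_n
from numpy.polynomial import hermite_e as He   # probabilists' He_n


def basis(n):
    """Coefficient vector that selects the n-th polynomial of a Hermite basis."""
    c = np.zeros(n + 1)
    c[n] = 1.0
    return c


x = np.linspace(-3.0, 3.0, 7)
conversion_holds = True
for n in range(7):
    he_n = He.hermeval(x, basis(n))
    h_n = H.hermval(x / np.sqrt(2.0), basis(n))
    # Conversion formula: He_n(x) = 2^(-n/2) H_n(x / sqrt(2))
    conversion_holds = conversion_holds and np.allclose(he_n, 2.0 ** (-n / 2) * h_n)

# Power-basis coefficients of He_4, constant term first: 3 - 6 x^2 + x^4
he4_power = He.herme2poly(basis(4))
```

The same `hermeval` call evaluates any finite combination of the $He_k$, which is convenient for the polynomial factor of a Gram-Charlier density.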

#### 2. Gram-Charlier Distributions

For a fixed $N$, consider the class of distributions that have a pdf of the form
$$\phi(x)\sum_{k=0}^{N} c_k He_k(x) \tag{11}$$
with $c_N \neq 0$. Noting that the leading term of $He_k(x)$ is $x^k$, we conclude that $N$ must necessarily be even, because if $N$ were odd then the polynomial that multiplies $\phi(x)$ would take negative values for some $x$. For the same reason $c_N$ cannot be negative.

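*Definition 1.* Let $a \in \mathbb{R}$, $b > 0$, $c_0 = 1$, and $c = (c_1, \dots, c_N) \in \mathbb{R}^N$. We write $X \sim GC(a, b, c)$ (or $X \sim GC(a, b, c_1, \dots, c_N)$) if the variable $(X - a)/b$ has probability density function
$$\phi(x)\sum_{k=0}^{N} c_k He_k(x). \tag{12}$$
This will be called a *Gram-Charlier distribution* with parameters $a, b, c$, with $c = (c_1, \dots, c_N)$. The largest $k$ such that $c_k \neq 0$ is called the *order* of the Gram-Charlier distribution. The normal distribution with mean $a$ and variance $b^2$ is a $GC(a, b, 0)$ (or $GC(a, b)$) with order 0.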

The class of Gram-Charlier distributions just defined includes all distributions with density
$$\frac{1}{b}\,\phi\!\left(\frac{x-a}{b}\right)p\!\left(\frac{x-a}{b}\right),$$
where $p$ is a polynomial of degree $N$, since $p$ can be rewritten as a combination of $He_0, He_1, \dots, He_N$. The condition $c_0 = 1$ ensures that function (12) integrates to one, since
$$\int_{-\infty}^{\infty}\phi(x)He_k(x)\,dx = 0, \qquad k \ge 1.$$
There are no simple conditions that ensure that a polynomial remains nonnegative everywhere, though in some cases precise conditions on $c$ are known; see below. If a vector $c$ leads to a true Gram-Charlier pdf, then we will say that $c$ is *valid*.
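A minimal sketch of the density may help fix the parametrization. It assumes the form $\frac{1}{b}\phi(z)\sum_{k=0}^{N} c_k He_k(z)$ with $z = (x-a)/b$ and $c_0 = 1$; the function name `gc_pdf` and the parameter values are hypothetical, chosen only for illustration.

```python
import numpy as np
from numpy.polynomial import hermite_e as He


def gc_pdf(x, a, b, c):
    """Density of GC(a, b, c) at x; c = (c_1, ..., c_N), with c_0 = 1 implied."""
    z = (np.asarray(x, dtype=float) - a) / b
    coeffs = np.concatenate(([1.0], np.asarray(c, dtype=float)))
    phi = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    return phi * He.hermeval(z, coeffs) / b


# Hypothetical parameters (c_1 = c_2 = 0, small negative skewness, positive kurtosis)
a, b, c = 0.05, 0.2, [0.0, 0.0, -0.05, 0.03]
grid = np.linspace(a - 12 * b, a + 12 * b, 20001)
dx = grid[1] - grid[0]
density = gc_pdf(grid, a, b, c)

total_mass = float(np.sum(density) * dx)        # Riemann sum over a wide grid
is_nonnegative = bool(np.all(density >= 0.0))   # grid check only, not a proof of validity
```

The mass is one for any $c$ because the $He_k$ with $k \ge 1$ integrate to zero against $\phi$; nonnegativity, by contrast, depends on $c$ and is only checked on the grid here.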

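Generating functions are convenient when dealing with orthogonal polynomials. One is
$$e^{tx - t^2/2} = \sum_{k=0}^{\infty} He_k(x)\frac{t^k}{k!}.$$
Another one is
$$\int_{-\infty}^{\infty}\phi(x)\,e^{sx - s^2/2}\,e^{tx - t^2/2}\,dx = e^{st}.$$
Expanding both sides in powers of $s$ and $t$ leads to
$$\int_{-\infty}^{\infty}\phi(x)He_j(x)He_k(x)\,dx = k!\,\mathbf{1}_{\{j = k\}}.$$
This proves the orthogonality of the Hermite polynomials and gives us the value of $\int\phi(x)He_k(x)^2\,dx = k!$, which is essential in deriving Gram-Charlier series. Since $He_0 = 1$, this also implies
$$\int_{-\infty}^{\infty}\phi(x)He_k(x)\,dx = 0, \qquad k \ge 1.$$
Another formula is the Laplace transform of $\phi(x)He_k(x)$, which may be found by integrating by parts $k$ times:
$$\int_{-\infty}^{\infty}e^{tx}\phi(x)He_k(x)\,dx = t^k e^{t^2/2}. \tag{21}$$
Let us calculate the moments of a distribution with density (11). First consider
$$\int_{-\infty}^{\infty}x^n\phi(x)He_k(x)\,dx.$$
Integrating by parts repeatedly yields
$$\int_{-\infty}^{\infty}x^n\phi(x)He_k(x)\,dx = \begin{cases}\dfrac{n!}{(n-k)!}\,m_{n-k}, & k \le n,\\[4pt] 0, & k > n,\end{cases}$$
where $m_j = \int x^j\phi(x)\,dx$ equals $j!/(2^{j/2}(j/2)!)$ for even $j$ and 0 for odd $j$. Hence, the $n$th moment of the distribution in (11) is
$$\sum_{k=0}^{\min(n, N)} c_k\,\frac{n!}{(n-k)!}\,m_{n-k}.$$
This says in particular that the parameter $c_k$ only affects the moments of order $k$ and higher of the distribution (see part (i) of Theorem 2). This is confirmed by the moment generating function, which may be found from (21):
$$\int_{-\infty}^{\infty}e^{tx}\phi(x)\sum_{k=0}^{N}c_k He_k(x)\,dx = e^{t^2/2}\sum_{k=0}^{N}c_k t^k.$$
It can be checked that differentiating this expression $n$ times and setting $t = 0$ give the same expression found for the $n$th moment of (11).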
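The moment generating function can be checked numerically. The sketch below assumes that, for $X = a + bZ$ with $Z$ having density $\phi(x)\sum_k c_k He_k(x)$, the mgf is $e^{at + b^2t^2/2}\sum_{k=0}^{N} c_k (bt)^k$, which follows from the Laplace transform of $\phi\,He_k$ after rescaling. The function names and parameter values are hypothetical.

```python
import numpy as np
from numpy.polynomial import hermite_e as He


def gc_mgf(t, a, b, c):
    """Closed-form mgf of GC(a, b, c); c = (c_1, ..., c_N), c_0 = 1 implied."""
    coeffs = np.concatenate(([1.0], np.asarray(c, dtype=float)))
    u = b * t
    poly = sum(ck * u**k for k, ck in enumerate(coeffs))
    return float(np.exp(a * t + 0.5 * u * u) * poly)


def gc_mgf_quadrature(t, a, b, c, half_width=14.0, n=40001):
    """Same quantity by brute-force integration of exp(t x) against the density."""
    coeffs = np.concatenate(([1.0], np.asarray(c, dtype=float)))
    z = np.linspace(-half_width, half_width, n)
    dz = z[1] - z[0]
    density_z = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi) * He.hermeval(z, coeffs)
    return float(np.sum(np.exp(t * (a + b * z)) * density_z) * dz)


# Hypothetical parameters; the identity holds whether or not c is valid
a, b, c = 0.05, 0.2, [0.1, -0.02, -0.05, 0.03]
closed_form = gc_mgf(1.5, a, b, c)
numeric = gc_mgf_quadrature(1.5, a, b, c)
```

Differentiating the closed form once at $t = 0$ gives the mean $a + b c_1$, which illustrates why $a$ is the mean only when $c_1 = 0$.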

Theorem 2. *Suppose , with , , and . The order of the distribution is necessarily even.**(a)**(b)**(c) The representation of the distribution in terms of the parameters , , and is unique.**(d) All Gram-Charlier distributions are determined by their moments.**(e) The set of valid in includes the origin, is not reduced to a single point, and is convex.**(f) The first six moments of the distribution are **(g) The first six cumulants of the distribution are **(h) The following hold for any distribution:**(i) Suppose , . Then the first moments of and are the same; that is,if and only if , . This implies that if up to are equal to 0 then the distribution has zero skewness and excess kurtosis; hence, this shows how to construct an infinite number of distributions that share this property with the normal.**(j) Suppose . Then When the skewness and excess kurtosis coefficients of are and , respectively, for any .**(k) If and is a constant then , where , . In particular, .**(l) The law of the square of is a combination of chi-square distributions with degrees of freedom that has density *

*Proof. *Parts (a) and (b) were proved above, and (c) follows directly from (b). To prove (d) it is sufficient to note the existence of for in an open neighbourhood of . For (e), if then the distribution is the standard normal, and this is a Gram-Charlier distribution. Since and is even, the polynomialtends to when tends to . Hence, there is such that(recall that for all Gram-Charlier distributions). The set of valid vectors thus includes for each . If , are valid then is also valid, for any .

Part (f) is found by expanding the mgf in (b) as a series in $t$ around the origin. Part (g) is found by expanding the logarithm of the mgf in the same way. The formulas in (h) follow from (f) and (g).

For part (i), it is sufficient to consider the case , only. Writeand calculate the moments of and by successively differentiating the mgf’s of and (note that ). The first moments of and are the same if and only ifSuppose that , . Then , , and thus the first moments of and are the same. Conversely, suppose . Then the first identity above implies that , since is not zero. The second identity implies that , and so on, up to .

Turning to (j), the first equivalence is an immediate consequence of property (i) with . To prove the second equivalence, suppose , and let . Then Here and ; hence, if and only if . The last equality is . For , with and being arbitrary, this means that For (k), observe that if then the mgf of is Finally, turn to (l). Routine calculations show that the density of the square is The Hermite polynomials of odd order are odd functions and so disappear from that expression, while the even order Hermite polynomials are even functions.

When $c = 0$ the $GC(a, b, c)$ distribution is the normal distribution with mean $a$ and variance $b^2$. However, part (b) of the theorem says that when $c \neq 0$ the parameters $a$ and $b^2$ are not necessarily the mean and variance of the distribution. Simple calculations lead to the following result.

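Theorem 3. *(a) If $X \sim GC(a, b, c)$ then, with $z = (x - a)/b$,*
$$P(X \le x) = \Phi(z) - \phi(z)\sum_{k=1}^{N} c_k He_{k-1}(z).$$
*(b) For $N \ge 2$, the tails of the distribution are*
$$P(X > x) \sim c_N z^{N-1}\phi(z) \ \text{ as } x \to \infty, \qquad P(X \le x) \sim c_N |z|^{N-1}\phi(z) \ \text{ as } x \to -\infty.$$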

The tails of the Gram-Charlier distributions are thicker than those of the normal distribution but are still “thin” because they are in the limit smaller than any exponential function $e^{-\alpha|x|}$, $\alpha > 0$.
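The distribution function can be sketched in a few lines. The code assumes the form $P(X \le x) = \Phi(z) - \phi(z)\sum_{k=1}^{N} c_k He_{k-1}(z)$ with $z = (x-a)/b$, which follows from $\frac{d}{dz}[\phi(z)He_{k-1}(z)] = -\phi(z)He_k(z)$; the function names and the parameter values are hypothetical, and the second function is only a brute-force cross-check.

```python
import math
import numpy as np
from numpy.polynomial import hermite_e as He


def gc_cdf(x, a, b, c):
    """P(X <= x) for X ~ GC(a, b, c); c = (c_1, ..., c_N)."""
    z = (x - a) / b
    c = np.asarray(c, dtype=float)
    Phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    # sum_{k=1}^N c_k He_{k-1}(z): c is the coefficient vector of He_0, ..., He_{N-1}
    shifted_sum = float(He.hermeval(z, c)) if c.size else 0.0
    return Phi - phi * shifted_sum


def gc_cdf_quadrature(x, a, b, c, lower=-14.0, n=200001):
    """Trapezoidal integration of the standardized density from `lower` to (x - a) / b."""
    coeffs = np.concatenate(([1.0], np.asarray(c, dtype=float)))
    z = np.linspace(lower, (x - a) / b, n)
    f = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi) * He.hermeval(z, coeffs)
    dz = z[1] - z[0]
    return float((np.sum(f) - 0.5 * (f[0] + f[-1])) * dz)


a, b, c = 0.05, 0.2, [0.0, 0.0, -0.05, 0.03]   # hypothetical parameters
closed_form = gc_cdf(0.1, a, b, c)
numeric = gc_cdf_quadrature(0.1, a, b, c)
```

Because only $\Phi$, $\phi$, and a polynomial are involved, evaluating probabilities costs about the same as in the normal case.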

##### 2.1. The $GC(a, b, 0, 0, c_3, c_4)$ Family

Here the exact region for the $(c_3, c_4)$ that lead to a true probability distribution has been found. This goes back to Barton and Dennis [16], but a more detailed explanation is given in Jondeau and Rockinger [6]. The region is shown in Figure 3 (use the correspondence $s = 6c_3$, $k = 24c_4$ from Theorem 2(h)). An important fact about this region is that it is not rectangular; the possible excess kurtosis values depend on skewness, and conversely.
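Whether a given coefficient vector is valid can be tested numerically. The sketch below does not reproduce the exact region of Barton and Dennis; it simply checks whether $p(x) = 1 + \sum_k c_k He_k(x)$ is nonnegative at its real critical points, which is enough for a polynomial of even degree with positive leading coefficient. The names `is_valid` and `is_valid_skew_kurt` are hypothetical, and the mapping $c_3 = s/6$, $c_4 = k/24$ assumes $c_1 = c_2 = 0$.

```python
import numpy as np
from numpy.polynomial import hermite_e as He


def is_valid(c, tol=1e-12):
    """True if p(x) = 1 + sum_k c_k He_k(x) is nonnegative for every real x."""
    coeffs = np.concatenate(([1.0], np.asarray(c, dtype=float)))
    power = np.trim_zeros(He.herme2poly(coeffs), "b")   # constant term first
    if power.size == 1:
        return bool(power[0] >= 0.0)                     # p is the constant 1
    if power.size % 2 == 0 or power[-1] < 0.0:
        return False                                     # odd degree or negative leading term
    p = np.polynomial.Polynomial(power)
    crit = np.asarray(p.deriv().roots(), dtype=complex)
    real_crit = crit[np.abs(crit.imag) < 1e-9].real
    # The global minimum of an even-degree polynomial is attained at a critical point
    return bool(np.all(p(real_crit) >= -tol))


def is_valid_skew_kurt(s, k):
    """Validity of GC(a, b, 0, 0, s/6, k/24) in terms of skewness s and excess kurtosis k."""
    return is_valid([0.0, 0.0, s / 6.0, k / 24.0])
```

For zero skewness the check reduces to $1 - 6c_4 \ge 0$ and $c_4 \ge 0$, that is, an excess kurtosis between 0 and 4; values exactly on the boundary are sensitive to rounding and are best avoided.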

##### 2.2. Distributions of Order 4 and Higher

The family $GC(a, b, c_1, c_2, c_3, c_4)$ has six parameters, rather than four, and thus has more degrees of freedom in fitting data; to the authors’ knowledge the general six-parameter Gram-Charlier distribution has been used in financial applications by León et al. [11] only (those authors use the subfamily consisting of polynomials that are squares of some second-degree polynomial). Schlögl [10] fits the six- and eight-parameter families $GC(a, b, 0, 0, c_3, \dots, c_6)$ and $GC(a, b, 0, 0, c_3, \dots, c_8)$ to data.

The set of $(c_1, c_2, c_3, c_4)$ that yield true probability distributions has not been identified, but it is possible to fit the six parameters by maximum likelihood.

##### 2.3. Why We Do Not Assume That $c_1 = c_2 = 0$

Almost all previous authors have assumed that $c_1 = c_2 = 0$, because they used normalized data, that is, data with mean 0 and variance 1 (see part (j) of Theorem 2), as explained below. In this section we explain why it is important not to restrict Gram-Charlier distributions or series in that way. The first reason for not doing so is that enlarging the parameter space can only be a good thing. The second one is that in fitting those distributions to data there may be very real advantages in letting $c_1$ and $c_2$ be different from zero. The only downside of letting $c_1$ and $c_2$ take nonzero values, and it is of no real importance, is that $c_3$ and $c_4$ lose their simple relationship with skewness and excess kurtosis (see part (h) of Theorem 2). A third reason is that after an exponential change of measure a Gram-Charlier distribution will rarely have $c_1 = c_2 = 0$; see Section 2.4.

We now show, using an example that can be worked out explicitly, that it is not always best to use normalized data when fitting Gram-Charlier distributions, because choosing another affine transformation of the data may well yield a much better fit.

The “standard” Gram-Charlier expansion for a function $f$ is
$$f(x) \sim \phi(x)\sum_{k=0}^{\infty} c_k He_k(x), \tag{47}$$
where
$$c_k = \frac{1}{k!}\int_{-\infty}^{\infty} He_k(x) f(x)\,dx.$$
A classical result about Hermite series, proved by Cramér [17], is that sufficient conditions for the Gram-Charlier expansion (47) to converge to $f(x)$ for all $x$ are that (i) $f$ has finite variation in every bounded interval, and (ii) $f$ satisfies
$$\int_{-\infty}^{\infty} |f(x)|\,e^{x^2/4}\,dx < \infty.$$
If $X$ has density $f$, then the last condition is $E\,e^{X^2/4} < \infty$. This condition cannot be improved upon, in the sense that there are cases where the Gram-Charlier series defined above diverges, although $E\,e^{\alpha X^2} < \infty$ for all $\alpha < 1/4$ (it will be shown below that one such case is the normal distribution with mean 0 and variance 2).

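Let us first calculate the coefficients $c_k$ for the $N(0, \sigma^2)$ density $f_\sigma$. Using the generating function (16) we find
$$\sum_{k=0}^{\infty} c_k t^k = \int_{-\infty}^{\infty} e^{tx - t^2/2} f_\sigma(x)\,dx = e^{(\sigma^2 - 1)t^2/2}, \qquad\text{so that}\qquad c_{2n} = \frac{(\sigma^2 - 1)^n}{2^n n!}, \quad c_{2n+1} = 0.$$
The Gram-Charlier expansion corresponding to the density $f_\sigma$ is thus
$$\phi(x)\sum_{n=0}^{\infty}\frac{(\sigma^2 - 1)^n}{2^n n!}\,He_{2n}(x). \tag{53}$$
It is possible to determine whether this converges or not when $x = 0$; since $He_{2n}(0) = (-1)^n (2n)!/(2^n n!)$, the Gram-Charlier series for $f_\sigma(0)$ is
$$\frac{1}{\sqrt{2\pi}}\sum_{n=0}^{\infty}(-1)^n\frac{(2n)!}{2^{2n}(n!)^2}\,(\sigma^2 - 1)^n. \tag{54}$$
From Stirling’s formula, $(2n)!/(2^{2n}(n!)^2) \sim 1/\sqrt{\pi n}$ as $n$ tends to infinity, which implies that there are three possibilities: expansion (54) (i) converges absolutely if $\sigma^2 < 2$, (ii) converges conditionally if $\sigma^2 = 2$, and (iii) diverges if $\sigma^2 > 2$. (The preceding calculations are from Cramér [17].)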
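The three regimes can be observed numerically. The sketch below assumes that the $n$th term of the Gram-Charlier series of the $N(0, \sigma^2)$ density at $x = 0$ is $\frac{1}{\sqrt{2\pi}}\binom{2n}{n}\left(-\frac{\sigma^2 - 1}{4}\right)^n$, which is what the generating function gives; the function name is hypothetical.

```python
import math


def gc_series_at_zero(sigma2, n_terms):
    """Terms and partial sums of the Gram-Charlier series of the N(0, sigma2) density at x = 0."""
    y = -(sigma2 - 1.0) / 4.0
    term = 1.0 / math.sqrt(2.0 * math.pi)   # n = 0 term
    terms, partial_sums, total = [], [], 0.0
    for n in range(n_terms):
        terms.append(term)
        total += term
        partial_sums.append(total)
        # binom(2n+2, n+1) / binom(2n, n) = 2 (2n + 1) / (n + 1)
        term *= y * 2.0 * (2 * n + 1) / (n + 1)
    return terms, partial_sums


# sigma^2 = 1.5 < 2: the partial sums settle on the true value 1 / sqrt(2 pi sigma^2)
terms_ok, sums_ok = gc_series_at_zero(1.5, 80)
true_value = 1.0 / math.sqrt(2.0 * math.pi * 1.5)

# sigma^2 = 3 > 2: the terms grow roughly like 2^n / sqrt(n), so the series diverges
terms_bad, sums_bad = gc_series_at_zero(3.0, 60)
```

When the series converges, the closed form $\sum_n \binom{2n}{n} y^n = (1 - 4y)^{-1/2}$ recovers $1/(\sigma\sqrt{2\pi})$, the value of the density at the origin.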

Let us now consider a random variable , , with density In words, has a distribution with probability and a distribution with probability . Define the normalized variable The coefficients of the Gram-Charlier expansion for are We have shown that (53) diverges if . Here we see that involves with and ; in the second case there is no problem, as for all . In the first case, This condition is satisfied for smaller than , and so the Gram-Charlier expansion for diverges for all . This is a sad state of affair: a series designed to work for distributions that are “close to the normal” that fails for combinations of two normal densities!

There is an easy solution: use a different scaling for . Rather than multiplying by , use a factor such that the series converges. In this example we know that the series converges if and only if . Say we choose ; this implies Figure 1 shows how the true density for is approximated by order 10 and 20 Gram-Charlier truncated series if is chosen; the graph is not surprising, the infinite Gram-Charlier expansion diverges. Figure 2 shows the same except that is set to . The latter is smaller than , so the infinite expansion converges. In this example the density to be approximated is symmetrical about 0, so is always 0, while

##### 2.4. Exponential Change of Measure

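The one-dimensional Cameron-Martin formula may be stated as follows: if $X \sim N(\mu, \sigma^2)$ then for $q \in \mathbb{R}$ and any measurable $g \ge 0$,
$$E\left[e^{qX} g(X)\right] = e^{\mu q + \sigma^2 q^2/2}\,E\left[g(X + \sigma^2 q)\right].$$
The same property may be expressed in terms of a change of measure. If $X \sim N(\mu, \sigma^2)$ under $\mathbb{P}$ and a change of measure is defined by
$$\frac{d\mathbb{P}'}{d\mathbb{P}} = \frac{e^{qX}}{E\,e^{qX}}, \tag{63}$$
then $X \sim N(\mu + \sigma^2 q, \sigma^2)$ under $\mathbb{P}'$. The next result is an extension of the Cameron-Martin formula for the normal distribution to the Gram-Charlier distributions. There are different though equivalent formulas in Schlögl [10] (for the special case $c_1 = c_2 = 0$).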

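Theorem 4. *Suppose that $X \sim GC(a, b, c)$ under $\mathbb{P}$ and that $\mathbb{P}'$ is defined by (63) for $q \in \mathbb{R}$. Then $X \sim GC(a + b^2 q, b, c')$ under $\mathbb{P}'$, where $c'$ is found from*
$$\sum_{k=0}^{N} c'_k b^k t^k = \frac{\sum_{k=0}^{N} c_k b^k (t + q)^k}{\sum_{k=0}^{N} c_k b^k q^k}$$
*or, more precisely,*
$$c'_j = \frac{\sum_{k=j}^{N}\binom{k}{j} c_k (bq)^{k-j}}{\sum_{k=0}^{N} c_k (bq)^k}, \qquad j = 1, \dots, N.$$

*Proof.* It is sufficient to calculate the mgf of $X$ under $\mathbb{P}'$:
$$E'\,e^{tX} = \frac{E\,e^{(t+q)X}}{E\,e^{qX}} = e^{(a + b^2 q)t + b^2 t^2/2}\,\frac{\sum_{k=0}^{N} c_k b^k (t + q)^k}{\sum_{k=0}^{N} c_k b^k q^k}.$$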

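*Example 5.* If $X \sim GC(a, b, 0, 0, c_3, c_4)$ and $q \neq 0$, the change of measure (63) leads to
$$c'_1 = \frac{3c_3 (bq)^2 + 4c_4 (bq)^3}{1 + c_3 (bq)^3 + c_4 (bq)^4}, \qquad c'_2 = \frac{3c_3\, bq + 6c_4 (bq)^2}{1 + c_3 (bq)^3 + c_4 (bq)^4}.$$
Hence, assuming $c_1 = c_2 = 0$ does not imply that $c'_1$ and $c'_2$ are also zero. In other words, the family $GC(a, b, 0, 0, c_3, c_4)$ is not closed under a change of measure that occurs naturally in option pricing (see Section 5).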
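The effect of the exponential change of measure on the parameters can be sketched as follows. The code assumes that tilting by $e^{qX}/E\,e^{qX}$ maps $GC(a, b, c)$ to $GC(a + b^2 q, b, c')$ with $c'_j = \sum_{k \ge j}\binom{k}{j} c_k (bq)^{k-j} / \sum_k c_k (bq)^k$, which is what dividing the mgf at $t + q$ by the mgf at $q$ gives; the function name `esscher_gc` and the parameter values are hypothetical.

```python
from math import comb

import numpy as np


def esscher_gc(a, b, c, q):
    """Parameters (a', b', c') of X under the measure with density exp(q X) / E[exp(q X)]."""
    coeffs = np.concatenate(([1.0], np.asarray(c, dtype=float)))
    n_max = coeffs.size - 1
    u = b * q
    denom = sum(coeffs[k] * u**k for k in range(n_max + 1))   # polynomial part of E[exp(q X)]
    new_c = [
        sum(comb(k, j) * coeffs[k] * u ** (k - j) for k in range(j, n_max + 1)) / denom
        for j in range(1, n_max + 1)
    ]
    return a + b * b * q, b, np.array(new_c)


# Hypothetical parameters with c_1 = c_2 = 0 before the change of measure
a, b, c = 0.0, 0.2, [0.0, 0.0, -0.05, 0.03]
a_new, b_new, c_new = esscher_gc(a, b, c, q=1.0)
```

With these inputs the first two entries of `c_new` are nonzero, so the tilted distribution leaves the family with $c_1 = c_2 = 0$ even though the original distribution belongs to it.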

##### 2.5. Simulating Gram-Charlier Distributions

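Simulation is required for many kinds of options, and it turns out that the Gram-Charlier distributions are very easy to generate, as we now show; there is no need to invert their distribution functions. When estimating some quantity $E\,g(X_1, \dots, X_n)$ by simulation, one generates $M$ independent vectors $(X_1, \dots, X_n)$ with the same distribution. Suppose all the $X_i$’s are independent and have the same Gram-Charlier distribution with density
$$\frac{1}{b}\,\phi\!\left(\frac{x-a}{b}\right)p\!\left(\frac{x-a}{b}\right),$$
where the polynomial $p$ is given in (5). Then
$$E\,g(X_1, \dots, X_n) = \widetilde{E}\left[g(X_1, \dots, X_n)\prod_{i=1}^{n} p\!\left(\frac{X_i - a}{b}\right)\right],$$
where under the measure $\widetilde{\mathbb{P}}$ the $X_i$’s have a normal distribution $N(a, b^2)$. Hence, estimating $E\,g(X_1, \dots, X_n)$ by simulation can be performed by generating ordinary normal random variables $Z_i^{(j)} \sim N(0, 1)$, $i = 1, \dots, n$, $j = 1, \dots, M$, and then using the estimator
$$\frac{1}{M}\sum_{j=1}^{M} g\!\left(a + bZ_1^{(j)}, \dots, a + bZ_n^{(j)}\right)\prod_{i=1}^{n} p\!\left(Z_i^{(j)}\right).$$
This is an application of the likelihood ratio method.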
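The likelihood ratio estimator can be sketched for a single Gram-Charlier variable as follows. The code assumes the density $\frac{1}{b}\phi(z)p(z)$ with $z = (x - a)/b$ and $p = 1 + \sum_k c_k He_k$; the function name, the sample size, the seed, and the parameter values are hypothetical.

```python
import numpy as np
from numpy.polynomial import hermite_e as He


def gc_expectation_mc(g, a, b, c, n_samples=200_000, seed=0):
    """Estimate E[g(X)], X ~ GC(a, b, c), from ordinary normal draws weighted by p(Z)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_samples)
    coeffs = np.concatenate(([1.0], np.asarray(c, dtype=float)))
    weights = He.hermeval(z, coeffs)     # likelihood ratio p(z) of GC against N(0, 1)
    return float(np.mean(g(a + b * z) * weights))


a, b, c = 0.05, 0.2, [0.0, 0.0, -0.05, 0.03]   # hypothetical parameters
mean_estimate = gc_expectation_mc(lambda x: x, a, b, c)
second_central = gc_expectation_mc(lambda x: (x - a) ** 2, a, b, c)
```

For a path of $n$ independent increments the weight would be the product of $p$ over the increments; since that product has mean one but can vary widely when $n$ is large, the variance of the estimator deserves monitoring.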

#### 3. Convolution of Gram-Charlier Distributions; Gram-Charlier Processes

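The simplest way to find the distribution of the sum of two independent Gram-Charlier distributed variables is to multiply their moment generating functions. Suppose $X_1 \sim GC(a_1, b_1, c^{(1)})$ and $X_2 \sim GC(a_2, b_2, c^{(2)})$ are independent. Then
$$E\,e^{t(X_1 + X_2)} = e^{(a_1 + a_2)t + (b_1^2 + b_2^2)t^2/2}\left[\sum_{j} c^{(1)}_j b_1^j t^j\right]\left[\sum_{k} c^{(2)}_k b_2^k t^k\right].$$
Expanding the product, this says that
$$X_1 + X_2 \sim GC\!\left(a_1 + a_2, \sqrt{b_1^2 + b_2^2}, c\right), \qquad c_m = (b_1^2 + b_2^2)^{-m/2}\sum_{j + k = m} c^{(1)}_j c^{(2)}_k b_1^j b_2^k. \tag{73}$$
It is then possible to know the explicit distribution of $S_n = X_1 + \cdots + X_n$, where the $X_i$’s are independent and have a Gram-Charlier distribution, constituting a discrete-time Gram-Charlier process. If the $X_i$’s have the same distribution then $S_n$ is a random walk. The derivation of the distribution of $S_n$ can be done recursively, using (73), or it can be done by finding the Taylor expansion of $\left[\sum_k c_k b^k t^k\right]^n$, and thus
$$E\,e^{tS_n} = e^{nat + nb^2t^2/2}\left[\sum_{k=0}^{N} c_k b^k t^k\right]^n.$$
These computations are simple using symbolic mathematics software. An example is given at the end of Section 6.1.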
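The multiplication of moment generating functions amounts to a polynomial convolution, as the following sketch shows. It assumes that the mgf of $GC(a, b, c)$ is $e^{at + b^2t^2/2}\sum_k c_k b^k t^k$ with $c_0 = 1$; the function name `gc_convolve` and the parameter values are hypothetical.

```python
import numpy as np


def gc_convolve(a1, b1, c1, a2, b2, c2):
    """Parameters of X1 + X2 for independent X1 ~ GC(a1, b1, c1) and X2 ~ GC(a2, b2, c2)."""
    p1 = np.concatenate(([1.0], np.asarray(c1, dtype=float)))
    p2 = np.concatenate(([1.0], np.asarray(c2, dtype=float)))
    # Polynomial factors of the two mgfs, written in powers of t
    p1 = p1 * b1 ** np.arange(p1.size)
    p2 = p2 * b2 ** np.arange(p2.size)
    product = np.convolve(p1, p2)
    b = float(np.hypot(b1, b2))
    c = product / b ** np.arange(product.size)   # back to coefficients of (b t)^m
    return a1 + a2, b, c[1:]


# Two-period log return built from identical one-period increments (hypothetical values)
a, b, c = 0.01, 0.1, [0.0, 0.0, -0.05, 0.03]
a_sum, b_sum, c_sum = gc_convolve(a, b, c, a, b, c)
```

In this example the sum has order 8, with $c_3$ divided by $\sqrt{2}$ and $c_4$ halved, consistent with the usual scaling of skewness and excess kurtosis for sums of two independent, identically distributed variables.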

The above raises the question of whether there is a continuous-time process that has Gram-Charlier distributed increments. There is such a process with normal increments (Brownian motion), and it is moreover a Lévy process.

Theorem 6. *The only Lévy process with Gram-Charlier distributed increments is Brownian motion.*

*Proof.* It is sufficient to show that no Gram-Charlier distribution other than the normal is infinitely divisible. If $X$ has a Gram-Charlier distribution then its moment generating function is of the form $E\,e^{tX} = e^{at + b^2t^2/2}\,p(t)$, where $p$ is a polynomial. Suppose $Y_1, \dots, Y_n$ are independent, have the same distribution, and add up to $X$ (in law). Fubini’s theorem implies that $E\,e^{tY_1}$ is finite for all real $t$, and thus $E\,e^{zY_1}$ is an analytic function of $z$ in the whole complex plane. We then have the identity $\left(E\,e^{zY_1}\right)^n = e^{az + b^2z^2/2}\,p(z)$ for every $n$. This means that the function $p(z)^{1/n}$, which is well defined for all $z$ that is not a zero of $p$, has an analytic continuation over the whole of $\mathbb{C}$ for each $n$, which is impossible unless $p$ is constant; for instance, take $n$ larger than the degree of $p$.

More precisely, this says that if we exclude Brownian motion, no increment of any Lévy process can have a Gram-Charlier distribution. Any Gram-Charlier process with independent increments must be discrete-time.

#### 4. The Log Gram-Charlier Distribution

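The distribution of the exponential of a Gram-Charlier distributed variable will naturally be called “log Gram-Charlier”, as we do for the lognormal: if $X \sim GC(a, b, c)$, then $Y = e^X$ has a log Gram-Charlier distribution with the same parameters. The density of $Y$ is
$$\frac{1}{by}\,\phi\!\left(\frac{\log y - a}{b}\right)\sum_{k=0}^{N} c_k He_k\!\left(\frac{\log y - a}{b}\right), \qquad y > 0.$$
This distribution has all moments finite, and they are given by Theorem 2(b). The log Gram-Charlier distribution shares one property with the lognormal: it is “moment indeterminate.”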

Theorem 7. *The log Gram-Charlier distribution is not determined by its moments. More precisely, there is an uncountable number of other distributions that have the same moments as any particular log Gram-Charlier distribution.*

*Proof. *There is a well-known way to construct a family of distributions that have the same moments as the lognormal (Feller [18], p. 227); the trick works for arbitrary parameters but, for simplicity, let ; that is, All the functions are nonnegative, integrate to 1, and have the same moments as (as a result of the symmetry of the standard normal distribution, after an obvious change of variable). Now, suppose (once again the same arguments work for other values of and ), and consider By the same change of variables used for the lognormal one immediately finds that integrates to one and has the same moments as . The only difference with the case of the lognormal is that is not necessarily nonnegative for . Two cases may arise. The first one is that the polynomial has no real zero. In that case its infimum is strictly greater than zero, and one can find a nonempty interval such that is nonnegative for all . In the second case has one or more zeros and the previous argument breaks down but can be modified to yield the same conclusion, if the function is replaced with where is a collection of intervals that include the zeros of , so defined that is not identically zero and satisfies (These two conditions are sufficient for to hold for .) The details are omitted. Another way to prove that the log Gram-Charlier distribution is moment indeterminate when has no zero (and thus the density does not take the value zero for any ) is to use a Krein condition (Stoyanov [19], p. 941), which says that a continuous distribution on with positive density is not determined by its moments if

#### 5. Option Pricing Formulas

The formulas below hold for any vector $c$ and thus extend those that have been derived by previous authors for the case where the log return has a $GC(a, b, 0, 0, c_3, c_4)$ distribution or a “squared” Gram-Charlier distribution (León et al. [11]). Schlögl [10] derives a formula equivalent to (90) but for densities expressed as an infinite Gram-Charlier series with $c_1 = c_2 = 0$.

As previous authors have done, we consider a market with a risky security and a risk-free security with annual return $r$ and suppose that the log return for period $[0, T]$ (denoted as $X$) has a Gram-Charlier distribution under the risk-neutral (or “pricing”) measure, which we denote as $\mathbb{Q}$. The physical measure (usually denoted as $\mathbb{P}$) is not specified (nothing says that the log return has or does not have a Gram-Charlier distribution under the physical measure).

The market model may have one or more periods, but since we consider the pricing of ordinary European puts and calls only the distribution of the log return for the whole period is required. For other types of options, in particular the applications presented in the next section, it may be necessary to use the one-period returns separately, as is done in Theorem 11.

The risky security has price at time 0, and

Theorem 8. *Suppose that a risky security pays dividends at a constant rate over and that the risk-free rate of interest is . Suppose also that under the risk-neutral measure the log return of the risky security over is , which satisfies the martingale conditionThen the time-0 price of a European call option with maturity and strike price iswhere , , and are given by (97), (101), and (105). The price of the corresponding European put is If , then the above simplify to *
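A hedged sketch of how such a call price might be computed is given below; it is not a transcription of the theorem's formulas. It assumes that the log return $X$ over $[0, T]$ is $GC(a, b, c)$ under the pricing measure, that the martingale condition reads $a + b^2/2 + \log\sum_k c_k b^k = (r - \delta)T$, that the price equals $S_0 e^{-\delta T}\,\mathbb{Q}^*(X > \log(K/S_0)) - K e^{-rT}\,\mathbb{Q}(X > \log(K/S_0))$ with $\mathbb{Q}^*$ the exponentially tilted measure with $q = 1$, and that $b$ is the scale for the whole period. All function names are hypothetical.

```python
import math
from math import comb

import numpy as np
from numpy.polynomial import hermite_e as He


def gc_sf(x, a, b, c):
    """P(X > x) for X ~ GC(a, b, c), from Phi, phi and Hermite polynomials."""
    z = (x - a) / b
    c = np.asarray(c, dtype=float)
    Phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    shifted_sum = float(He.hermeval(z, c)) if c.size else 0.0   # sum_k c_k He_{k-1}(z)
    return 1.0 - Phi + phi * shifted_sum


def martingale_a(r, delta, T, b, c):
    """Location parameter a implied by E[exp(X)] = exp((r - delta) T)."""
    coeffs = np.concatenate(([1.0], np.asarray(c, dtype=float)))
    poly_at_b = sum(ck * b**k for k, ck in enumerate(coeffs))
    return (r - delta) * T - 0.5 * b * b - math.log(poly_at_b)


def gc_call(S0, K, r, delta, T, b, c):
    """European call price when the log return over [0, T] is GC(a, b, c), a set by martingale_a."""
    coeffs = np.concatenate(([1.0], np.asarray(c, dtype=float)))
    n_max = coeffs.size - 1
    a = martingale_a(r, delta, T, b, c)
    denom = sum(coeffs[k] * b**k for k in range(n_max + 1))
    # Coefficients under the tilted measure (exponential change of measure with q = 1)
    c_star = [
        sum(comb(k, j) * coeffs[k] * b ** (k - j) for k in range(j, n_max + 1)) / denom
        for j in range(1, n_max + 1)
    ]
    kappa = math.log(K / S0)
    asset_leg = S0 * math.exp(-delta * T) * gc_sf(kappa, a + b * b, b, c_star)
    cash_leg = K * math.exp(-r * T) * gc_sf(kappa, a, b, c)
    return asset_leg - cash_leg
```

With an empty `c` and `b` equal to $\sigma\sqrt{T}$ the two tail probabilities become $\Phi(d_1)$ and $\Phi(d_2)$, so the sketch reduces to the Black-Scholes price with a continuous dividend yield.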

*Proof. *Absence of arbitrage implies that , or This justifies (89). The price of the call is The second part is easier to deal with The probability of the event can be calculated explicitly by recalling that and using Theorem 3: whereTo calculate use the exponential change of measure formula, defining Then Since ,whereHence, Using