Abstract
Financial markets are complex processes where investors interact to set prices. We present a framework for option valuation under imperfect information, taking risk-neutral parameter uncertainty into account. The framework is a direct generalization of the existing valuation methodology. Many investors base their decisions on mathematical models that have been calibrated to market prices. We argue that the calibration process introduces a source of uncertainty that needs to be taken into account. The models and parameters used may differ to such an extent that one investor may find an option underpriced, whereas another investor may find the very same option overpriced. This problem is not taken into account by any of the standard models. The paper is concluded by presenting simulations and an empirical study on FX options, where we demonstrate improved predictive performance (in sample and out of sample) using this framework.
1. Introduction
Mathematical models are used in the financial industry for prediction and risk management. The quality of the models is crucial. During the summer of 2007, the media reported:
"We are seeing things that were 25-standard deviation events, several days in a row." (David Viniar, Goldman Sachs CFO)
The Goldman Sachs GEO fund lost 30% of its value in a week due to such rare events. Assessing and controlling these risks is of vital interest to avoid unpleasant surprises. A 25-standard deviation event should almost never occur if the data generating process is Gaussian (the probability of an event of this size or larger is roughly $3 \times 10^{-138}$) but will occur from time to time if the model is heavy tailed.
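To put this figure in perspective, the following sketch (illustrative only; the Student-t alternative and its three degrees of freedom are our assumptions, not part of the study) computes the probability of a 25-standard-deviation move under a Gaussian model and under a heavy-tailed one.

```python
# Tail probability of a 25-standard-deviation move under a Gaussian model and
# under a heavy-tailed alternative. The Student-t with 3 degrees of freedom is
# an arbitrary illustrative choice, not a model from the paper.
from scipy import stats

z = 25.0
p_gauss = stats.norm.sf(z)     # P(Z > 25) for a standard normal, roughly 3e-138
p_heavy = stats.t.sf(z, df=3)  # same event under a Student-t(3), vastly larger

print(f"Gaussian tail probability:     {p_gauss:.2e}")
print(f"Student-t(3) tail probability: {p_heavy:.2e}")
```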
Mathematical models are either parametric or nonparametric. Parametric models are the dominant approach, as they are easier to analyze and easier to fit to data. A limitation (and simultaneously the strength) of parametric models is their limited flexibility, resulting in low variance and some bias, whereas nonparametric models are flexible and less biased but often poor (highly variable) predictors (cf. [1]).
Popular parametric option valuation models include the Black and Scholes model [2], the Merton jump diffusion [3], the Heston stochastic volatility [4], the Bates stochastic volatility jump diffusion [5], exponential Lévy processes such as Variance Gamma [6], Normal Inverse Gaussian [7], CGMY [8], and stochastic intensity models [9, 10].
However, it was claimed by [11] that even advanced models cannot explain all features in data (in the volatility surface), suggesting that additional or alternative modeling is needed. This view is shared by [12] where it is concluded that transaction costs are the key determinant of the curvature of the smile.
Parametric models are calibrated to data by estimating the parameters, often by minimizing some loss function (cf. [13]) or through nonlinear Kalman filters; see [14]. It is easily argued that inaccurate calibration methods can cause problems. Inefficient estimators imply that another estimator would give better estimates. Oversimplified models will also cause problems, directly by providing biased forecasts and indirectly by having large parameter uncertainty due to large residual variance, while overparameterized models will overfit the data. Estimators such as the Gaussian Maximum Likelihood estimator minimize the mean square prediction error, trading bias for variance to achieve this.
Several calibration methods are primarily designed to estimate the volatility, although other parameters may also be of interest. Real-world calibration is complex, as several subjective choices need to be made. Two things are needed: an estimator and a set of data. Starting with the estimators, we have to choose between the following.
(i) Historical volatility, which is the standard (MLE) volatility estimator, scaled to account for the sampling frequency. Statistical theory (Cramér-Rao bounds, etc.) suggests that this should be the optimal estimator, given that the volatility is constant over time. More recent variations on this theme include realized volatility and Bipower variation (cf. [15]).
(ii) Time series models, such as ARCH/GARCH models [16], stochastic volatility models, or EWMA filters. These provide volatility forecasts that capture temporal variations in the volatility.
(iii) Implied volatility is a common name for estimating the volatility from quoted options rather than from the underlying asset. The simplest implied volatility estimator is found by inverting the Black and Scholes formula, while other estimators such as VIX use a combination of prices having different moneyness and time to maturity to estimate the volatility.
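As a concrete illustration of estimators (i) and (iii), consider the minimal sketch below; the data, the 252-day annualization, and the helper functions are our assumptions, and a GARCH/EWMA filter from (ii) would replace the flat historical average with a time-varying forecast.

```python
# Two of the estimator families above: (i) historical volatility from
# log-returns of the underlying, and (iii) implied volatility from a single
# option quote. Data and parameter values are illustrative assumptions.
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

def historical_vol(prices, periods_per_year=252):
    """Annualized sample standard deviation of log-returns."""
    log_returns = np.diff(np.log(prices))
    return np.std(log_returns, ddof=1) * np.sqrt(periods_per_year)

def bs_call(S, K, T, r, sigma):
    """Black and Scholes price of a European call."""
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    return S * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)

def implied_vol(quote, S, K, T, r):
    """Volatility that makes the Black and Scholes price match the quote."""
    return brentq(lambda sig: bs_call(S, K, T, r, sig) - quote, 1e-6, 5.0)

prices = np.array([100.0, 101.2, 100.5, 102.3, 101.8, 103.1])
print(historical_vol(prices))                             # backward-looking
print(implied_vol(4.0, S=100.0, K=100.0, T=0.5, r=0.02))  # forward-looking
```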
Studies have shown (cf. [17]) that implied estimators often outperform all other estimators, even though recent realized volatility estimators are increasingly efficient and may also provide good estimates. Reference [17] explained this finding by the fact that implied estimators look forward in time, whereas other estimators extrapolate from historical data. Another explanation is the higher quality of the data: a single option provides a reasonable estimate of the volatility, while historical estimators require large data sets to provide good estimates.
The purpose of the estimators is also important, as most estimators will only estimate either the objective measure $\mathbb{P}$ or the risk-neutral measure $\mathbb{Q}$, and both are needed when hedging options in incomplete markets.
Another, but related, problem is the selection of data. Several factors will influence the result.
(i) The sampling frequency can be of paramount importance! Data sampled at higher frequencies should in theory give better estimates, but market microstructure (e.g., bid-ask spreads) tends to invalidate some of the gain. A related problem is that different time scales have different dependence structures. The correlation structure in high frequency data is sometimes claimed to be similar to long range dependence, while the correlation structure in daily or weekly data is ordinary (e.g., exponentially decaying).
(ii) The size of the estimation window can influence the results. Restricting the data set to recent data will lead to noisy estimates, while including too much historical data leads to bias and difficulties in tracking market variations.
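The window-size trade-off in (ii) can be illustrated with a small simulation (entirely synthetic, not part of the paper's data): a short rolling window reacts quickly to a volatility shift but is noisy, while a long window is smooth but slow to adapt.

```python
# Effect of the estimation window: rolling historical volatility on synthetic
# returns whose true volatility doubles halfway through the sample. The window
# lengths are arbitrary illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
daily_vol = np.concatenate([np.full(500, 0.10), np.full(500, 0.20)]) / np.sqrt(252)
returns = rng.normal(0.0, daily_vol)

def rolling_vol(returns, window, periods_per_year=252):
    """Annualized standard deviation over a trailing fixed-length window."""
    out = np.full(len(returns), np.nan)
    for t in range(window, len(returns)):
        out[t] = np.std(returns[t - window:t], ddof=1) * np.sqrt(periods_per_year)
    return out

short_window = rolling_vol(returns, window=20)    # noisy, but tracks the shift
long_window = rolling_vol(returns, window=250)    # smooth, but slow to adapt
print(short_window[550], long_window[550])        # shortly after the change
```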
It is highly unlikely, taking different estimators and data sets into consideration, that all investors are using identical estimates, thereby causing the “market parameters” to be unknown.
The purpose of this paper is to value options under parameter uncertainty. Reference [18] studies model uncertainty, which is related to parameter uncertainty. The primary purpose of their paper is not to price the model risk, but rather to quantify the size of the risk. We believe that it is important to value the parameter uncertainty, for example, when computing hedges.
Valuation of options under parameter uncertainty was treated in [19, 20]. Both papers use a Bayesian framework to compute the posterior distribution of prices. However, their approach is purely statistical (the expectation is taken over the objective, $\mathbb{P}$, distribution) and is not based on financial theory (the $\mathbb{Q}$ distribution). Averaging over the $\mathbb{P}$ distribution when the $\mathbb{Q}$ parameters should be used could easily result in biased prices. Still, their work is important, as model averaging usually improves predictive performance (cf. [21]).
Reference [22] introduces stochastic parameters as a method of improving the fit of basic models. Their resulting valuation formulas are similar to what we derive for simple (Black and Scholes-like) models.
The paper is organized as follows. In Section 2, we review the basics of the risk-neutral valuation framework. In Section 3, we proceed by suggesting a modification to the standard risk-neutral valuation, and Section 4 presents simulations in this framework. Section 5 provides an empirical study on FX options, and Section 6 concludes the paper.
2. Valuation of Options
The basis for valuation of contingent claims is the risk neutral valuation formula; see [23]. Let $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{t \ge 0}, \mathbb{P})$ be a filtered probability space on which a stochastic process $S = \{S_t\}_{t \ge 0}$ is defined. The stochastic process is adapted to the natural filtration augmented with the $\mathbb{P}$-null sets $\mathcal{N}$. The process is used to model the absolute price of the underlying asset.
Relative values are often more useful than absolute values, and hence a numeraire $B$ is introduced. The standard choice for options on equity is a risk free bank account, modeled as $B_t = \exp(\int_0^t r_s\, \mathrm{d}s)$, where $r_s$ is the time-varying risk free interest rate. FX models may use the domestic bank account as numeraire when pricing options on foreign currency. The market is then made up of $(S, B)$. The bank account is not a traded asset in practice, and it is often convenient to switch between the bank account and zero coupon bonds in the derivations. These are equivalent for deterministic rates, and the extension can also be done for stochastic rates.
An important theoretical object when valuing options is equivalent probability measures, that is, probability measures $\mathbb{Q}$ such that the null sets for the measures $\mathbb{Q}$ and $\mathbb{P}$ coincide, $\mathbb{Q} \sim \mathbb{P}$. An important class of equivalent probability measures is the equivalent martingale measures, defined as equivalent probability measures satisfying
$$\frac{S_t}{B_t} = \mathbb{E}^{\mathbb{Q}}\!\left[\frac{S_T}{B_T}\,\Big|\,\mathcal{F}_t\right], \qquad t \le T,$$
or, equivalently, that the discounted price process $S/B$ is a $\mathbb{Q}$-martingale.
A basic rule of thumb, see [23], states that the existence of an equivalent martingale measure $\mathbb{Q}$ is a sufficient condition for the market to be free from arbitrage, and uniqueness of $\mathbb{Q}$ is a sufficient condition for perfect replication of any option, using dynamical trading in the underlying instrument and the numeraire. Knowing the replicating portfolio eliminates any uncertainty regarding the value.
Options in complete or incomplete markets are valued using the risk neutral valuation formula. The value of a European option, having contract function $\Phi$, is given by
$$\Pi_t = B_t\, \mathbb{E}^{\mathbb{Q}}\!\left[\frac{\Phi(S_T)}{B_T}\,\Big|\,\mathcal{F}_t\right]$$
for any risk neutral measure $\mathbb{Q}$. Values are in general not unique, as any such measure generates option values without any internal mispricing (arbitrage).
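For concreteness, a minimal Monte Carlo sketch of this formula follows, assuming a geometric Brownian motion under $\mathbb{Q}$ and a constant short rate; all parameter values are placeholders.

```python
# Monte Carlo version of the risk-neutral valuation formula for a European
# call: discount the Q-expected payoff. The geometric Brownian motion dynamics
# and all parameter values are illustrative assumptions.
import numpy as np

def mc_call_value(S0, K, T, r, sigma, n_paths=100_000, seed=0):
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_paths)
    # Terminal asset value under the risk-neutral measure Q.
    ST = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * z)
    payoff = np.maximum(ST - K, 0.0)
    return np.exp(-r * T) * payoff.mean()   # B_t * E_Q[ payoff / B_T ]

print(mc_call_value(S0=100.0, K=100.0, T=1.0, r=0.02, sigma=0.2))
```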
Reference [24] showed that all pricing rules that fulfill some axiomatic consistency conditions can be expressed as discounted conditional risk neutral expectation, regardless of the model used. The work in this paper is therefore based on the risk neutral valuation formula.
3. Parameter Uncertainty
The valuation theory in Section 2 is derived under perfect information, whereas real-world investors face imperfect information. This complicates the situation, and we approach it with a simplified example.
Example 3.1. Consider a market consisting of many different investors. They will, based on the discussion above, use different models and data sets to form trading strategies.
We simulate their behavior by assuming that all investors are using the Heston model, but they differ in terms of data. Specifically, we use USD/EURO FX option data, where each investor is using 50% (randomly selected) of the available FX option quotes. Their resulting implied volatility surfaces are presented in Figure 1.
The first principle for any investor is to buy when prices are low and sell when prices are high. Thus some investors will find ITM and OTM options expensive, while other investors will find the opposite (they largely agree on the price of ATM options). The investors will therefore trade until they no longer find any of the options mispriced, that is, until prices have stabilized somewhere between their calibrated valuations.

What happened in Example 3.1 is an effect of imperfect information: investors are using different risk-neutral measures, and different measures can generate similar volatility surfaces. This effect will be explored further by studying filtrations.
3.1. Filtrations
Filtrations are, loosely speaking, the information generated by the stochastic process.
Definition 3.2. The market filtration $\{\mathcal{F}_t\}$ is defined as the natural filtration generated by the stochastic process $S$, augmented by the $\mathbb{P}$-null sets $\mathcal{N}$,
$$\mathcal{F}_t = \sigma(S_u : u \le t) \vee \mathcal{N}.$$
The market filtration is used in the theoretical valuation framework but is not available to the investors. It contains path-wise, continuous time information which corresponds to an infinite sample size when estimating volatilities.
The observed process is the market process observed at discrete times $t_0 < t_1 < \cdots$. It was shown by [25] that it is statistically optimal not to use tick data but to sample less frequently (say every 30 minutes) to suppress noise due to market microstructure when estimating volatility. Later research suggests that subsampling and averaging can suppress some of the microstructure. However, markets do not trade continuously, making the discussion of whether most of the microstructure can be suppressed or not irrelevant in this setting!
Definition 3.3. The observed filtration $\{\mathcal{F}^o_t\}$ is defined as the natural filtration generated by the sampled stochastic process, augmented by the $\mathbb{P}$-null sets $\mathcal{N}$,
$$\mathcal{F}^o_t = \sigma(S_{t_k} : t_k \le t) \vee \mathcal{N}.$$
The sampled stochastic process is a discrete time process while the market process is a continuous time process. It is obvious that $\mathcal{F}^o_t \subseteq \mathcal{F}_t$. Note that the standard valuation formula is using information ($\mathcal{F}_t$) not available to investors ($\mathcal{F}^o_t$).
Remark 3.4. We can augment the market and observed filtration with option data, but this does not change the fact that the market filtration is generated by a continuous process and the observed filtration is generated by a discrete process.
3.2. Revised Valuation Formula
The risk-neutral valuation formula was reviewed in Section 2. Reference [24] has shown that model-free valuation can always be expressed as a discounted, conditional expectation with respect to some equivalent martingale measure. This result will play an important role in the remainder of the paper.
Parametric models generate a risk neutral distribution, conditional on the filtration and the fixed parameters $\theta$. This distribution is used in all parametric valuation models, and we write option values computed using this parametric distribution as
$$\Pi_t(\theta) = B_t\, \mathbb{E}^{\mathbb{Q}}\!\left[\frac{\Phi(S_T)}{B_T}\,\Big|\,\mathcal{F}^o_t, \theta\right].$$
This valuation formula does not fit into the framework of [24], as the expectation is taken conditional on both the filtration and the parameters. The problem we experienced in Example 3.1 was that the parameters are not known without errors, and we should therefore take parameter uncertainty into account. This is done by interpreting the parameters as a random vector $\Theta$.
Keeping in mind that the solution must satisfy the results in [24] restricts the class of solutions to conditional expectations with respect to some equivalent martingale measure.
Lemma 3.5. Let $\mathcal{F}_t$ be the market and $\mathcal{F}^o_t$ the observed filtration. The value of an option having contract function $\Phi$, based on the sampled process, is given by
$$\Pi_t = B_t\, \mathbb{E}^{\mathbb{Q}}\!\left[\frac{\Phi(S_T)}{B_T}\,\Big|\,\mathcal{F}^o_t\right].$$
Proof. The result follows immediately from [24, Theorem ].
Theorem 3.6. The value of an option is given by
$$\Pi_t = \mathbb{E}^{\mathbb{Q}}\!\left[\Pi_t(\Theta)\mid \mathcal{F}^o_t\right], \tag{3.5}$$
where $\Pi_t(\theta)$ is the value of the option conditional on the filtration and the parameters $\theta$.
Proof. This follows from Lemma 3.5, the law of total probability, and Fubini's theorem (cf. [26]),
$$B_t\, \mathbb{E}^{\mathbb{Q}}\!\left[\frac{\Phi(S_T)}{B_T}\,\Big|\,\mathcal{F}^o_t\right] = B_t\, \mathbb{E}^{\mathbb{Q}}\!\left[\mathbb{E}^{\mathbb{Q}}\!\left[\frac{\Phi(S_T)}{B_T}\,\Big|\,\mathcal{F}^o_t, \Theta\right]\Big|\,\mathcal{F}^o_t\right] = \mathbb{E}^{\mathbb{Q}}\!\left[\Pi_t(\Theta)\mid\mathcal{F}^o_t\right].$$
This result states that a fair value is a risk-neutral consensus of what different investors are prepared to pay, compare with Example 3.1.
Why is this not noticed in the Black and Scholes model? Several authors have in fact suggested adjustments similar to (3.5), compare with [19, 20, 22]. Still, let us see what happens when valuing a European call option in a Black and Scholes model.
Example 3.7. The Black and Scholes model uses a Geometric Brownian motion to model the underlying asset, while the numeraire is a bank account. Let us start by assuming that we observe the market filtration, $\mathcal{F}^o_t = \mathcal{F}_t$. It contains enough information to estimate $\sigma$ (using the quadratic variation) without errors; thus the conditional distribution of $\sigma$ is a point mass. Applying the modified risk-neutral valuation formula (3.5) gives
$$\Pi_t = \mathbb{E}^{\mathbb{Q}}\!\left[c^{BS}(t, S_t; \sigma)\mid\mathcal{F}_t\right] = c^{BS}(t, S_t; \sigma),$$
where $c^{BS}$ is the Black and Scholes formula. This result is not surprising since the market is complete, and no parameter uncertainty is present. The replicating hedge will then correspond to a unique price.
Changing the filtration to the observed filtration $\mathcal{F}^o_t$ breaks this property. The distribution will not be a point mass, as $\sigma$ cannot be estimated without errors. Instead, we get
$$\Pi_t = \mathbb{E}^{\mathbb{Q}}\!\left[c^{BS}(t, S_t; \sigma)\mid\mathcal{F}^o_t\right] = \int c^{BS}(t, S_t; \sigma)\, \mathrm{d}\mathbb{Q}_{\sigma\mid\mathcal{F}^o_t}(\sigma),$$
for which any numerical approximation is a mixture of Black and Scholes models, compare with [22],
$$\Pi_t \approx \frac{1}{N}\sum_{n=1}^{N} c^{BS}(t, S_t; \sigma_n), \qquad \sigma_n \sim \mathbb{Q}_{\sigma\mid\mathcal{F}^o_t}.$$
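A minimal numerical sketch of this mixture follows, assuming a log-normal distribution for $\sigma$ given the observed filtration; the distributional choice and all numbers are illustrative.

```python
# Mixture-of-Black-and-Scholes value under uncertain volatility: average the
# Black and Scholes price over draws of sigma. The log-normal distribution for
# sigma and the numbers below are illustrative assumptions.
import numpy as np
from scipy.stats import norm

def bs_call(S, K, T, r, sigma):
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    return S * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)

def bs_call_uncertain_vol(S, K, T, r, sigma0, s, n_draws=1000, seed=0):
    """Average of Black and Scholes prices over log-normally distributed sigma."""
    rng = np.random.default_rng(seed)
    sigmas = sigma0 * np.exp(s * rng.standard_normal(n_draws))
    return np.mean([bs_call(S, K, T, r, sig) for sig in sigmas])

print(bs_call(100.0, 120.0, 0.5, 0.02, 0.2))                       # point value
print(bs_call_uncertain_vol(100.0, 120.0, 0.5, 0.02, 0.2, s=0.3))  # mixture value
```

Because the Black and Scholes price is convex in $\sigma$ away from the money, the mixture value exceeds the point-estimate value for OTM and ITM strikes, which is the source of the smile seen in the simulations of Section 4.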
Example 3.7 showed that the Black and Scholes model does not have any internal inconsistencies when the market filtration is being used, while some inconsistencies are found when using the observed filtration.
Remark 3.8. The value of options in a stochastic volatility framework, see, for example, [27], where the correlation between the volatility and the underlying is zero, can be written similarly to (3.5). The value is then given by
$$\Pi_t = \mathbb{E}^{\mathbb{Q}}\!\left[c^{BS}(t, S_t; \bar{\sigma})\mid\mathcal{F}_t\right], \qquad \bar{\sigma}^2 = \frac{1}{T-t}\int_t^T \sigma_s^2\, \mathrm{d}s.$$
Moving on to more advanced models reveals the general structure. We take the Merton model as our test model.
Example 3.9. The Merton model was introduced in [3]. The model consists of a diffusion term (similar to the Black and Scholes model) and a jump term, where jumps are log-normal and arrive according to a Poisson process. Statistical tools such as realized volatility can provide estimates of the quadratic variation, while tools such as Bipower variation can provide estimates of the continuous component of the quadratic variation (cf. [15]). It is therefore possible, as in the Black and Scholes model, to estimate the parameters in the diffusion term without errors using the market filtration.
But this does not hold for the jump component, as only a finite number of jumps are observed (there is even a nonzero probability that no jumps are observed!). It is therefore impossible to estimate parameters related to the jump measure without errors using the market filtration! Consequently, the conditional distribution of the jump parameters is not a point mass, even under the market filtration $\mathcal{F}_t$.
The less informative observed filtration will only emphasize this difference, as we can no longer observe the number of jumps with probability one, obscuring the separation of the jump component from the diffusion component.
Infinite activity Lévy processes are another popular class of processes. It is tempting to assume that these models avoid the difficulties associated with the Merton model. This is not true, as only infinitely small jumps can have infinite activity. Any Lévy process can be decomposed into a term with jumps smaller than some constant and another term containing all larger jumps. The Asmussen-Rosiński approximation, see [28], can be applied to a large subclass of infinite activity models, and it gives conditions under which the small-jump term can be approximated with a diffusion term, but this only brings the class of models back to a jump-diffusion-type model.
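In generic notation (the truncation level $\varepsilon$, the Lévy measure $\nu$, and the symbols below are our choices, not the paper's), the decomposition and the Asmussen-Rosiński condition can be summarized as
$$X_t = \gamma_\varepsilon t + \sigma W_t + \int_0^t\!\!\int_{|x|<\varepsilon} x\,\tilde N(\mathrm{d}s,\mathrm{d}x) + \int_0^t\!\!\int_{|x|\ge\varepsilon} x\, N(\mathrm{d}s,\mathrm{d}x),$$
$$\sigma(\varepsilon)^2 = \int_{|x|<\varepsilon} x^2\, \nu(\mathrm{d}x), \qquad \int_0^t\!\!\int_{|x|<\varepsilon} x\,\tilde N(\mathrm{d}s,\mathrm{d}x) \approx \sigma(\varepsilon)\, W'_t \quad\text{if } \sigma(\varepsilon)/\varepsilon \to \infty \text{ as } \varepsilon \to 0,$$
so that, after the approximation, the model again consists of a diffusion term plus a compound Poisson jump term.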
4. Simulations
We explore the effects of the new valuation framework in the Black and Scholes model and the Merton model. These scalar models are well known and can easily be generalized within the framework.
The risk neutral expectation over the parameter space is computed using Monte Carlo simulation, each simulation using 1000 samples. Common random numbers have been used when possible. Parameters with support on $\mathbb{R}$ are left unchanged,
$$\theta = \theta_0 + \varepsilon, \qquad \varepsilon \sim N(0, s^2), \tag{4.1}$$
while parameters with support on $(0, \infty)$ are transformed using a logarithmic transformation,
$$\theta = \theta_0 \exp(\varepsilon), \qquad \varepsilon \sim N(0, s^2). \tag{4.2}$$
These random variables have expectation and variance that increase with $s$.
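A small sketch of the two perturbation schemes, sharing common random numbers across uncertainty levels, is given below; the sample size and parameter values are placeholders.

```python
# The two perturbation schemes (4.1) and (4.2), sharing common random numbers
# across different uncertainty levels s. Sample size and values are placeholders.
import numpy as np

rng = np.random.default_rng(0)
eps = rng.standard_normal(1000)   # common random numbers, reused for every s

def perturb_real(theta0, s):
    """Eq. (4.1): parameters with support on the whole real line."""
    return theta0 + s * eps

def perturb_positive(theta0, s):
    """Eq. (4.2): parameters with support on the positive half-line."""
    return theta0 * np.exp(s * eps)

for s in (0.05, 0.1, 0.2):
    sigmas = perturb_positive(0.2, s)
    print(s, sigmas.mean(), sigmas.std())   # mean and spread grow with s
```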
The variance is varied over a range of values for all parameters in order to study the effects of the parameter uncertainty. We have computed the modified option values by having only one parameter stochastic at a time, keeping all other parameters deterministic. This is done to make the results easier to interpret, and it would be trivial to extend the simulations to having all parameters stochastic.
4.1. Black and Scholes
It was argued in [29] that the distribution of the volatility is almost log-normal, making (4.2) our choice of parameterization. Other reasonable distributions exist; the volatility could even be treated as Normally distributed if the variance is small, compare with (4.1) (the asymptotic distributions of most estimators are Gaussian due to the central limit theorem).
The market is simulated with the initial value of the underlying being , strikes ranging from in steps of and time to maturity varying from in steps of . The volatility is chosen as and the risk free rate is . Finally, prices were then computed using (a), (b), (c), and (d); all results are presented in Figure 2.

[Figure 2: panels (a)-(d).]
It can be seen that large parameter uncertainty generates a significant volatility smile, and also a noticeable term structure.
4.2. Merton
We use the same market when computing prices in a Merton framework. The Merton model is parameterized by the volatility $\sigma$, the jump intensity $\lambda$, the expected jump size $\mu_J$, and the jump size volatility $\sigma_J$.
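For reference, the Merton call price can be written as a Poisson-weighted sum of Black and Scholes prices; the sketch below follows that classical series representation, with parameter names matching the description above and purely illustrative numerical values.

```python
# Merton jump-diffusion call price written as a Poisson-weighted sum of Black
# and Scholes prices (Merton's series representation). Parameter names follow
# the description above; numerical values are placeholders.
import numpy as np
from math import factorial
from scipy.stats import norm

def bs_call(S, K, T, r, sigma):
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    return S * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)

def merton_call(S, K, T, r, sigma, lam, mu_j, sigma_j, n_terms=50):
    """Condition on the number of jumps; each term is a Black and Scholes price
    with jump-adjusted drift and volatility."""
    k = np.exp(mu_j + 0.5 * sigma_j**2) - 1.0     # expected relative jump size
    lam_tilde = lam * (1.0 + k)
    price = 0.0
    for n in range(n_terms):
        sigma_n = np.sqrt(sigma**2 + n * sigma_j**2 / T)
        r_n = r - lam * k + n * (mu_j + 0.5 * sigma_j**2) / T
        weight = np.exp(-lam_tilde * T) * (lam_tilde * T) ** n / factorial(n)
        price += weight * bs_call(S, K, T, r_n, sigma_n)
    return price

print(merton_call(S=100.0, K=100.0, T=0.5, r=0.02,
                  sigma=0.1, lam=1.0, mu_j=-0.05, sigma_j=0.1))
```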
The option values and corresponding implied volatilities were computed with uncertainty in one parameter at a time, shown in panels (a)-(d), (e)-(h), (i)-(l), and (m)-(p); all results are presented in Figure 3.

[Figure 3: panels (a)-(p).]
The differences between the implied volatilities of the standard Merton model and those of the modified Merton models are presented in Figure 4.

[Figure 4: panels (a)-(l).]
It can be seen that adding parameter uncertainty to the diffusion volatility generates a volatility smile, much as it did for the Black and Scholes model. The jump intensity $\lambda$ does not seem to make much difference, and the same holds for the jump size volatility $\sigma_J$. Introducing parameter uncertainty to the expected jump size $\mu_J$ makes a notable difference to the implied volatility surface. The change is dominated by an upward shift of the volatility surface, similar to what an increase of the (deterministic) diffusion parameter would give. We therefore recommend that only the diffusion volatility $\sigma$ is introduced as an uncertain parameter in the Merton model, in order to avoid identifiability problems, compare with [1].
5. Empirical Study
The simulation study indicated that taking uncertain parameters into account adds features to the volatility surface of the model. This section will analyze whether it makes any difference in practice.
We use weekly quoted FX options, written on the USD/EURO exchange rate, quoted from January 7, 2004 to January 30, 2008. The data includes options having 1 week, 1 month, 3 months, 6 months, 1 year, and 2 years to maturity, often with several different strikes for each time to maturity.
Parameters can be estimated using nonlinear least squares, minimizing the sum of the squared differences between the observed prices and the predicted prices,
$$\hat{\theta}_t = \arg\min_{\theta} \sum_{k} \left(\Pi^{\mathrm{obs}}_{k,t} - \Pi_{k,t}(\theta)\right)^2.$$
This calibration method usually gives highly variable estimates. The variability can be reduced by adding a penalty, where we used a quadratic (ridge regression type) penalty, defining the calibration problem as
$$\hat{\theta}_t = \arg\min_{\theta} \left[\sum_{k} \left(\Pi^{\mathrm{obs}}_{k,t} - \Pi_{k,t}(\theta)\right)^2 + \kappa \left\|\theta - \hat{\theta}_{t-1}\right\|^2\right].$$
The first 20 weeks are used as a training data set in order to obtain good initial parameter estimates, whereafter the estimate from the previous week ($\hat{\theta}_{t-1}$) is used as the reference parameter for the current estimate ($\hat{\theta}_t$).
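A sketch of this penalized calibration is given below, with a Black and Scholes model standing in for the general pricing function; the pricing interface, the penalty weight, and the quotes are our assumptions.

```python
# Penalized (ridge-type) nonlinear least-squares calibration: squared pricing
# errors plus a quadratic pull towards last week's estimate. The pricing
# interface, penalty weight, and quotes below are illustrative assumptions.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def calibrate(observed_prices, contracts, theta_ref, model_price, penalty=0.1):
    """contracts: list of (S, K, T, r); theta_ref: previous week's estimate."""
    def objective(theta):
        if np.any(theta <= 0):              # guard: volatility must stay positive
            return 1e12
        model = np.array([model_price(theta, *c) for c in contracts])
        sse = np.sum((observed_prices - model) ** 2)
        return sse + penalty * np.sum((theta - theta_ref) ** 2)
    return minimize(objective, x0=theta_ref, method="Nelder-Mead").x

def bs_price(theta, S, K, T, r):
    """Black and Scholes call with theta = [sigma], used as a stand-in model."""
    sigma = theta[0]
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    return S * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d1 - sigma * np.sqrt(T))

contracts = [(100.0, 95.0, 0.5, 0.02), (100.0, 100.0, 0.5, 0.02), (100.0, 105.0, 0.5, 0.02)]
quotes = np.array([8.4, 5.6, 3.4])
theta_hat = calibrate(quotes, contracts, theta_ref=np.array([0.2]), model_price=bs_price)
print(theta_hat)
```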
The fit was evaluated using the Mean Absolute Error (MAE) and the Mean Squared Error (MSE) computed using all options in the validation set.
5.1. In-Sample Results
We have calibrated the Black and Scholes model (with and without uncertain volatility), the Merton model (with and without uncertain volatility), and the Heston stochastic volatility model to our data. The results based on the in-sample residuals are presented in Table 1.
It can be seen that the Black and Scholes model is the least accurate model in sample, and the Merton model with uncertain volatility is the most accurate. The Heston stochastic volatility model performs similarly to the Merton model. Models with uncertain parameters provide a better fit than their standard counterparts; this was expected, as additional parameters are available to improve the fit.
5.2. Out-of-Sample Results
Getting a good in-sample fit only requires sufficiently many parameters. A more interesting test is obtained from the out-of-sample fit, here evaluated stepwise by pricing the current week's options using the parameters obtained from historical data ($\hat{\theta}_{t-1}$). The results are presented in Table 2.
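A sketch of this walk-forward evaluation is given below, reusing the (assumed) calibrate and model_price interfaces from the calibration sketch above; the weekly data layout is an assumption.

```python
# Walk-forward (out-of-sample) evaluation: price this week's quotes with last
# week's calibrated parameters, record the errors, then recalibrate.
import numpy as np

def walk_forward(weeks, theta_init, calibrate, model_price):
    """weeks: time-ordered list of (observed_prices, contracts) tuples."""
    theta_prev = theta_init
    abs_errors, sq_errors = [], []
    for observed_prices, contracts in weeks:
        predicted = np.array([model_price(theta_prev, *c) for c in contracts])
        err = observed_prices - predicted          # out-of-sample pricing errors
        abs_errors.extend(np.abs(err))
        sq_errors.extend(err ** 2)
        theta_prev = calibrate(observed_prices, contracts, theta_prev, model_price)
    return np.mean(abs_errors), np.mean(sq_errors)   # MAE and MSE
```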
These results are consistent with the results from the in-sample calibrations. Models with uncertain parameters provide better fit to data, both in sample and out of sample.
6. Conclusions
We have introduced a framework, based on mathematical and financial theory, for including risk neutral parameter uncertainty when valuing contingent claims. Some of these ideas have been known for some time for simple models, such as the Black and Scholes model.
The framework extends all existing models by computing the market value as a risk neutral expectation taken over the parameter space, that is, as the average price for a set of different parameters. This corresponds to valuing options as the consensus of what different investors are prepared to pay.
The resulting valuation formulas were shown to fit real FX data better than their standard counterparts, both in sample and out of sample.
Acknowledgments
The financial support from the Bank of Sweden Tercentenary Foundation under Grant P2005-0712 is gratefully acknowledged. The author would also like to thank the anonymous referees for comments and suggestions.