Abstract
This paper provides explicit formulae for the probability that an arithmetic or a geometric Brownian motion will not cross an absorbing boundary defined as a step function during a finite time interval. Various combinations of downward and upward steps are handled. Numerical computation of the survival probability is done quasi-instantaneously and with utmost precision. The sensitivity of the survival probability to the number and the ordering of the steps in the boundary is analyzed.
1. Introduction
The question of the first passage time of a diffusion process to an absorbing boundary is of central importance in many mathematical sciences. As mentioned in Wang and Pötzelberger [1], it arises in biology, economics, engineering reliability, epidemiology, finance, genetics, seismology, and sequential statistical analysis. Typically, one needs to compute the probability that some random dynamics modelled as a diffusion process will remain under or above some critical threshold over a given time interval. This is often referred to as a survival probability. Some papers focus on specific forms of the boundary, for example, a square root one (Breiman [2] and Sato [3]) or a curved one (Groeneboom [4] and Daniels [5]). Other contributions focus on a specific diffusion process, mostly Brownian motion (Park and Schuurmann [6], Jennen and Lerche [7], Salminen [8], Scheike [9], Novikov et al. [10], and Fu and Wu [11], 2010) or Ornstein-Uhlenbeck process (Alili et al. [12] and Lo and Hui [13]). Alternatively, some papers provide results for general classes of diffusion processes or boundaries (Durbin [14], Pötzelberger and Wang [15], Wang and Pötzelberger [1], and Downes and Borovkov [16]).
The contributions mentioned in this short noncomprehensive survey either focus on numerical algorithms usually involving recursive multidimensional quadrature or they seek to obtain approximate solutions by substituting the initial boundary with another one for which computations are easier and then deriving a bound for the error entailed by using the approximating boundary. This is because no closed form solution is known except in very few special cases when the boundary is linear and the underlying process is a Brownian motion. The main contribution of this paper is to provide a closed form solution to the problem of the computation of the survival probability of an arithmetic or geometric Brownian when the boundary is a step function of time. In general, this problem is numerically approximated by Monte Carlo simulation. The reason for this is twofold: from an analytical standpoint, the calculations involved are cumbersome and relatively complicated; from a numerical standpoint, the dimension of the integrals involved increases linearly with the number of steps in the boundary. In this paper, explicit formulae are provided up to five steps. An extension to higher dimensions is analytically straightforward. These formulae can be numerically computed quasi instantaneously with utmost precision in contrast to the slow and inaccurate approximations yielded by a Monte Carlo simulation. Nonmonotonic sequences of upper and lower steps are handled. The sensitivity of the survival probability with regard to the number and the ordering of the steps in the boundary is examined essentially as a function of the volatility of the underlying process. The problem of the location of the maximum or the minimum of Brownian motion with drift over several time intervals is also tackled. Although the purpose of this paper is not to study a specific engineering problem, one can mention that an immediate application of the results derived in this paper is the pricing and the hedging of many kinds of popular financial contracts known as barrier options and lookback options.
Section 2 provides a formula for the survival probability of an arithmetic Brownian motion when the absorbing boundary includes three steps. Section 3 is dedicated to numerical experiments whose purpose is to gain insight into the sensitivity of the survival probability to the number and the ordering of the steps. Section 4 extends the results of Section 2 to higher dimensions of the absorbing boundary. Section 5 provides a generalization to two-sided absorbing boundaries.
2. Formula for the Survival Probability of an Arithmetic Brownian Motion through a Sequence of Three Absorbing Steps
Let be a real constant and let be a positive real constant. Let be a standard Brownian motion. Under a given probability measure , let be the process driven byA finite time interval is considered and divided into subintervals , . Let us define a piecewise constant boundary made up of horizontal line segments matching time subintervals, in other words, a step function. These line segments are called the steps of the boundary. They take on constant values in each time subinterval , . The boundary may or may not be monotonically increasing or decreasing in time. When , the step is said to be upward, and when , it is said to be downward. When all the steps of the boundary are upward, the boundary itself is said to be upward and may also be called an upper boundary, while a downward boundary (also referred to as a lower boundary) is one whose steps are all downward.
The process is said to have survived at maturity when none of the steps has been hit at any moment in continuous time from to .
The tractability of the calculation of the survival probability depends on the number of steps in the boundary. Indeed, each additional step increases the dimension of the integration problem under consideration. We first come up with a general closed form solution for boundaries made up of three steps. Higher dimensions are considered in Section 3.
Let be defined as one of the six following cumulative distribution functions, where and , , are real constants: Then, we have the following.
Proposition 1. where if and zero otherwise; if and zero otherwise; if and zero otherwise; if and zero otherwise; if and zero otherwise, if and zero otherwise.And the function is the cumulative distribution function of three correlated standard normal random variables , , and with upper bounds and correlation coefficients between and , between and , and between and ; the numerical evaluation of this function can be carried out with double precision and computational time of approximately 0.01 second using the algorithm by Genz [17].
Proof of Proposition 1. , , and are absolutely continuous random variables. Moreover, is a Markov process. Hence, by applying the law of total probability and by conditioning, we havewhere , , , , , and and , and are jointly normal random variables and each pairwise correlation coefficient is equal to , , . Besides, the distribution of the maximum (or the minimum) of a Brownian motion with drift on a closed time interval, conditional on the two endpoints, is given byThus, expanding the trivariate normal probability density function, using (5) and substituting into (4), we obtainwhereIt is possible to solve this triple integral in closed form as a linear combination of sixteen trivariate standard normal cumulative distribution functions and this yields Proposition 1. The details are quite cumbersome, so they are omitted. However, it is easy to verify that a Monte Carlo simulation will produce numbers that slowly converge to the exact values provided by Proposition 1 whatever the parameters considered. It is highly recommended to use a conditional Monte Carlo approach inasmuch as the latter enables approximating the value of the integral in (6) by randomly drawing three and only three values of the process at each run, without having to discretize the time interval; for some background on the conditional Monte Carlo approach, also known as Brownian Bridge Monte Carlo, the reader is referred to Dagpunar [18]. After implementing conditional Monte Carlo and drawing random numbers from the Mersenne Twister generator, we observed that at least 20,000,000 simulations were required to be sure to attain a minimum level of convergence with the values provided by Proposition 1. These empirical findings were obtained out of a sample of 1,500 lists of randomly drawn parameters used as inputs to Proposition 1. They are reported in more detail in Table 1.
Before turning to Section 3, one may notice that the following combinations of steps are not handled by Proposition 1: This is not because (8) and (9) are more difficult to obtain using the method outlined in the proof of Proposition 1 but simply because the resulting formulae involve twice as many multivariate distribution functions as in the statement of Proposition 1, which would make the latter very bulky. Closed form formulae for (8) and (9) are available from the author upon request.
3. Numerical Experiments
Let us consider a geometric Brownian motion driven bywhere is a constant, is a positive constant, and is a standard Brownian motion.
Without loss of generality, this process is assumed to start at level at time . We first deal with sequences of maxima or minima, that is, -type and -cumulative distribution functions. We are interested in the following survival probabilities:An elementary application of Ito’s lemma to shows that (9) and (10) can be obtained by computingwhere and is driven byThus, Proposition 1 can be used to compute exact survival probabilities of the geometric Brownian motion .
The boundaries under consideration are said to “widen” on a given time interval when the current step is located farther from the initial process value than the previous step. Likewise, the boundaries are said to “shrink” on a given time interval when the current step is nearer to the initial process value than the previous step. Our objective is threefold:(i)To measure the effect of widening the boundary at one or two steps, compared with keeping a fixed boundary until maturity.(ii)To measure the effect of ordering the steps of the boundary in different ways; in particular, we compare survival probabilities when the farthest step is located in the last time interval and the nearest step is located in the first time interval.(iii)To compare, conversely, survival probabilities when the boundary is upward and when it is downward.For the latter comparison to be unbiased or neutral, we select upward and downward steps whose distances to the origin are the same. Additionally, the probability of not hitting a boundary is a function of the deterministic part of the stochastic differential equation driving the process: a positive drift coefficient increases the chances of hitting an upward boundary, while a negative drift coefficient increases the chances of hitting a downward boundary. That is why we set symmetric values of of 3% and % when examining sequences of maxima and sequences of minima, respectively.
Three levels of the dispersion coefficient are selected: 18% (defining a “low volatility regime”), 36% (defining an “intermediate volatility regime”), and 64% (defining a “high volatility regime”).
Numerical results are reported in Tables 2 and 3.
First, it can be noticed that the increase in the survival probability entailed by a widening of the boundary is not symmetric whether one deals with an upward or a downward boundary. In a low volatility regime, the probability of remaining above a downward boundary is higher than the probability of remaining below an upward boundary. The reverse holds in a high volatility regime. In the intermediate volatility regime, the result depends on the order of the steps of the boundary.
It must be stressed that not all these observations are intuitive. Recall that the drift and the dispersion coefficients of a geometric Brownian motion such as are proportional to at any given , so that the drift coefficient of is equal to . When , we have if as in Table 2 and if as in Table 3. The deterministic part of the instantaneous dynamics thus drives the process downward in the high volatility regime, so that it decreases the likelihood of crossing an upward boundary and increases the likelihood of crossing a downward boundary. It is therefore not surprising to observe that the probability of remaining below an upward boundary is higher than that of remaining above a downward boundary in the high volatility regime, all the more so as is more negative in Table 3 than in Table 2. However, the observation that one can make in the low volatility regime may seem counter-intuitive. Indeed, when , we have if and if . The deterministic part of the instantaneous dynamics thus drives the process upward in Table 2, which has a negative effect on the probability of survival; but the process is driven downward in Table 3, which also creates a negative effect on the probability of survival and at a faster rate than that with which the process is driven upward in Table 2. Hence, from an intuitive point of view, one would expect the probability of remaining below an upward boundary to be higher than that of remaining above a downward boundary in the low volatility regime too.
Second, one can observe that widening the boundary has a much more significant impact when the farthest step is located in the last time interval if volatility is low, whether one deals with an upward or a downward boundary, while the reverse is true in a high volatility regime; that is, the probability that the process will have survived at time is higher when the farthest step is set in the first time interval. This phenomenon can be explained by the fact that the chances of hitting the boundary are already substantial early on in the life of the process when volatility is high, so that widening the boundary at the first time interval exerts a “strong” positive impact on the probability of survival, whereas the risk of absorption only gradually increases over time when volatility is low, so that widening the boundary at an early stage has less effect on the probability of survival. However, the magnitude of this phenomenon in a high volatility regime is much larger when the boundary is upward rather than downward, which suggests that when volatility is high, the risk of hitting the boundary at an early stage of the process is greater when the boundary is upward. Interestingly enough, the probabilities of survival are very close in Table 3 when the farthest step of the two-step boundary is located in the first or in the last time interval and volatility is high.
Another noticeable fact is the fact that the chances of survival rise when setting two gradually increasing or decreasing steps rather than one much higher or much lower step whatever the level of volatility. More precisely, the survival probability with a two-step boundary is lower than that with a three-step boundary whether the boundary is widening or shrinking even as we consider an identical weighted average of the steps in both cases; in other words, it is “safer” to go through the set of steps rather than through the set of steps and it is “safer” to go through the set of steps rather than through the set of steps , although the average (weighted by each time interval) is 133 + 1/3 in both sets; similarly, it is safer to go through the set of steps rather than through the set of steps and it is safer to go through the set of steps rather than through the set of steps , although the weighted average is 73 + 1/3 in both sets. The differences between survival probabilities as a function of the number of steps in the boundary are smaller when volatility is high than when it is intermediate or low.
These observations suggest that we should use Proposition 1 to compute the probability that the maximum (or the minimum) reached by the process will lie in the first, the second, or the third time interval, as a function of volatility, considering three time intervals of identical length (otherwise, it is easily anticipated that any extremum will more likely be reached in any time interval (i.e., significantly larger than the others)). The probability that the maximum will lie in the first time interval can be obtained by computingThe trivariate density function inside this integral is obtained by differentiating Proposition 1 three times. This operation is tedious but straightforward and can be dealt with painlessly by using standard scientific computing software. The probabilities that the maximum will lie in the second or in the third time interval can be computed likewise in an obvious manner. Numerical results are reported in Table 4, using parameters consistent with Table 2. The quantity is defined as the global maximum; that is,Conditional expected values of , obtained by Monte Carlo simulation, are also reported.
We can see that, whatever the level of volatility, the probability that is reached between and , that is, in the middle time interval, is approximately the same. It is also always the least likely outcome. In contrast, the probability that is reached between and , that is, in the first time interval, steadily increases with , while the probability that is reached between and , that is, in the last time interval, steadily decreases with . When volatility is low, the most likely outcome is that will be reached in the last time interval. This result is quite intuitive, as low volatility implies that it takes time for the process to hit large positive values and that the chances that the process will plummet at an early stage are thin. When volatility is high, the most likely outcome is that will be reached in the first time interval. This could be because of the fact that as volatility rises the likelihood that the process will hit lower and lower values over time increases, so that an increasingly smaller fraction of paths will have enough time to make the journey back to highly positive territory. This explanation is supported by the distribution of the location of in space. Indeed, Table 4 shows that when is reached in the first time interval, the expected value of is always smaller than that when is reached in the second time interval or in the third time interval, whatever the level of volatility, which implies that the occurrence of the event is positively correlated with the number of paths spending “much” time below between and .
These various phenomena are confirmed and even more pronounced on longer time horizons, especially in a high volatility regime, where is by far the most likely event, as shown by Table 5. The proportion of paths that verify remains noticeably stable, whatever the volatility level and the time horizon. Not surprisingly, the expected values of rise significantly on longer time horizons.
So far, the emphasis has been laid on volatility as the fundamental variable, but one might object that it is not so much as the ratio that matters. In Table 6, the value of is significantly raised from to . It is clear that the pattern observed in Tables 4 and 5 is not altered but merely shifted, with bigger absolute values for relative to ; just as in Tables 4 and 5, the probability that is reached in the first time interval steadily increases with , the probability that is reached in the last time interval steadily decreases with , and the probability that is reached in the middle time interval is strikingly stable.
Finally, in Table 7, we take a brief look at the impact of combining both downward and upward steps in the boundary. Let us denote by the set of probabilities of survival under a sequence of minima or maxima, that is, the set including and , and let us denote by the set of probabilities of survival under a sequence of minima and maxima, that is, the set including , , , and . It is noticeable that when volatility is low, the elements in are never very different from those in , roughly speaking, while the value of the elements in plummets versus that of the elements in in a high volatility regime. The reason for the latter phenomenon is the fact that when volatility is high, the vast majority of the paths that never cross until time are those that spend a lot of time in regions far beneath the starting point , so that there is little chance that these paths will always be above between and . Most of them will actually lie under at time . The same is true, to a slightly lesser extent, of the paths that never cross until time . Similarly, in a high volatility regime, the vast majority of the paths that never cross until time or even are those that spend a lot of time in regions far above the starting point , so that there is little chance that these paths will always be below between and or even between and .
The moderate differences between the elements in and those in when volatility is low can be explained by the fact that, in such an environment, the probability of survival is high anyway, whatever the number and the location of the steps of the boundary. However, it is interesting to notice that is smaller than and . Indeed, we know from Tables 2 and 3 that, in a low volatility regime, the probability of remaining above a downward boundary is higher than the probability of remaining below an upward boundary; that is, we have . All the more so in Table 7, since the positive drift coefficient creates an upward bias that decreases the likelihood of crossing a downward boundary. Since downward steps span two-thirds of the total time interval in versus one-third in both and the inequalities and show that the order in which steps with opposite direction are introduced matters more than the weight of downward steps versus upward steps; otherwise, we would have the reverse inequalities and . In the high volatility regime, the order in which steps with opposite direction are introduced is relatively negligible for those boundaries where downward steps outweigh upward steps, that is, for and , while it has a considerable impact on those boundaries where upward steps outweigh downward steps, that is, for and . All these properties cannot be easily “guessed” or anticipated and require an exact formula such as Proposition 1 to be established.
4. Extension to Higher Dimension
The kind of analytical calculations that led to Proposition 1 can be carried out in dimensions greater than three. Similar to the approach laid out in the proof of Proposition 1, for any number of upward steps, the survival probability of a process with dynamics defined by (1) is given by the following -dimensional integral:where the function , is the cumulative distribution function associated with the -variate normal random vector and is the following product of intervals: The question is, can this integral be solved as a linear combination of functions that are computable with adequate accuracy and efficiency? Although the analytical side of the problem is not trivial, the main issue is numerical. Indeed, an explicit solution involves a combination of a number of -dimensional Gaussian integrals. Currently, there is no algorithm capable of evaluating the multivariate normal cumulative distribution function efficiently and with arbitrary precision as soon as the dimension of the normal integral is greater than or equal to 4. However, it can be noticed that, as a consequence of the Markov property of the process , the joint density function associated with the -variate normal random vector simplifies towhich is simpler than the standard -dimensional normal density function for computational purposes.
The integral to be solved becomes This new representation makes calculations easier both from an analytical perspective and from a numerical perspective. A general solution for any remains intractable, but it becomes possible to achieve a closed form solution for moderate without too much effort, although the resulting formulae will be rather bulky. In particular, for , one can obtain the following proposition.
Proposition 2. Given and , , , , and , we have